JSON

JSON stands for JavaScript Object Notation. It’s a plain-text format for structured data, built from objects (key-value pairs) and arrays, and it’s what most scraping APIs return because programs can parse it directly, without the usual HTML cleanup mess.

Examples

A simple JSON response from a scraping API looks like this:

{
  "url": "https://example.com/product/123",
  "title": "Running Shoes",
  "price": "$79.99",
  "in_stock": true,
  "reviews": 214
}
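To make the mapping concrete, here is a minimal sketch that parses the response above with Python's standard `json` module. The `raw` string simply repeats the example payload; converting the price to `Decimal` is an illustrative choice, not something the API does for you.

```python
import json
from decimal import Decimal

# The example payload from above, as the raw text an API would send.
raw = """
{
  "url": "https://example.com/product/123",
  "title": "Running Shoes",
  "price": "$79.99",
  "in_stock": true,
  "reviews": 214
}
"""

record = json.loads(raw)  # a JSON object becomes a Python dict

# JSON types map onto Python types: true -> True, numbers -> int/float.
print(type(record).__name__)  # dict
print(record["in_stock"])     # True
print(record["reviews"] + 1)  # 215

# The price arrives as a string, so strip the currency symbol before doing math.
price = Decimal(record["price"].lstrip("$"))
print(price)                  # 79.99
```

Keeping the price as a string in the payload and converting it on the consumer side avoids floating-point rounding surprises with money.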

In Python, you typically parse JSON into a normal dictionary:

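import requests

# Fetch the endpoint; the timeout keeps a stalled request from hanging forever.
response = requests.get("https://api.example.com/data", timeout=10)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

# .json() decodes the body into Python objects (a dict for a JSON object).
data = response.json()

print(data["title"])
print(data["price"])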
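A target site or proxy can return an HTML error page where JSON was expected, in which case parsing fails. Below is a minimal sketch of guarding against that with the standard library; `parse_json_body` is a hypothetical helper name, not part of `requests`.

```python
import json

def parse_json_body(text):
    """Return the parsed JSON value, or None if the body is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(parse_json_body('{"title": "Running Shoes"}'))    # {'title': 'Running Shoes'}
print(parse_json_body("<html>502 Bad Gateway</html>"))  # None
```

One caveat with this design: a body that is the literal `null` also comes back as `None`, so use a different sentinel if you need to tell the two cases apart.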

If you're scraping through ScrapeRouter, structured JSON is a lot easier to plug into a pipeline than raw HTML:

curl -X POST "https://www.scraperouter.com/api/v1/scrape/" \
  -H "Authorization: Api-Key $api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'
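The same request can be built from Python. This sketch uses only the standard library and mirrors the curl call above (endpoint, headers, body); the `build_scrape_request` name is hypothetical, and since the response shape is not shown in this entry, the send step is left as a comment.

```python
import json
import urllib.request

API_URL = "https://www.scraperouter.com/api/v1/scrape/"  # endpoint from the curl example

def build_scrape_request(target_url, api_key):
    """Build (but do not send) a POST request mirroring the curl call."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_scrape_request("https://example.com", "YOUR_API_KEY")
print(req.get_method(), req.full_url)

# To actually send it (requires a real key and network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     data = json.load(resp)
```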

Practical tips

  • JSON is great when the site already exposes structured data through an API, or when your scraper returns extracted fields instead of full HTML.
  • Don’t assume every field is present: production data is messy, keys disappear, values come back as null, and types drift.
  • Store raw responses when debugging parsing issues. It saves time when something silently changes.
  • Validate shape before trusting it, especially in batch jobs:
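def is_valid_product(data):
    # Valid only if the payload is a dict and every required key has a non-null value.
    required = ["title", "price"]
    if not isinstance(data, dict):
        return False
    return all(key in data and data[key] is not None for key in required)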
  • Prefer JSON over hand-parsed HTML when you can. Less brittle, less cleanup, fewer weird edge cases later.
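The tips about missing keys, nulls, and type drift can be sketched as a small normalization step. The field names follow the example response earlier in this entry; `normalize_product` and its fallback rules are assumptions for illustration, not a prescribed schema.

```python
def normalize_product(data):
    """Coerce a loosely shaped product record into predictable types."""
    # .get() returns None for missing keys instead of raising KeyError.
    title = data.get("title")
    price = data.get("price")
    reviews = data.get("reviews")

    # Type drift: reviews may arrive as 214, "214", or null.
    try:
        reviews = int(reviews) if reviews is not None else None
    except (TypeError, ValueError):
        reviews = None

    return {
        "title": title.strip() if isinstance(title, str) else None,
        "price": str(price) if price is not None else None,
        # Only a literal JSON true counts as in stock; "false" or 1 do not.
        "in_stock": data.get("in_stock") is True,
        "reviews": reviews,
    }

print(normalize_product({"title": " Running Shoes ", "price": 79.99, "reviews": "214"}))
```

Running every record through one function like this keeps the messy cases in a single place, so downstream code can rely on a fixed set of keys and types.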

Use cases

  • Scraping pipelines: pass extracted data between services without dragging HTML around.
  • API responses: most scraping APIs return JSON because it’s easy to consume in Python, JavaScript, Go, and basically everything else.
  • Storage: save records as JSON Lines (one JSON object per line) or as documents for batch processing, retries, and audits.
  • Monitoring changes: compare JSON fields over time, like price, stock status, or ranking, without reparsing whole pages.
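The storage and monitoring use cases can be combined in a short sketch: write snapshots as JSON Lines, then compare selected fields between two records. The file path, function names, and tracked fields are all hypothetical.

```python
import json
import os
import tempfile

def append_record(path, record):
    """Append one record as a single line of JSON (JSON Lines)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def read_records(path):
    """Read every non-empty line back into a Python object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def changed_fields(old, new, fields=("price", "in_stock")):
    """Return {field: (old_value, new_value)} for tracked fields that differ."""
    return {
        key: (old.get(key), new.get(key))
        for key in fields
        if old.get(key) != new.get(key)
    }

# Two snapshots of the same product, written to a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "products.jsonl")
append_record(path, {"title": "Running Shoes", "price": "$79.99", "in_stock": True})
append_record(path, {"title": "Running Shoes", "price": "$69.99", "in_stock": True})

old, new = read_records(path)
print(changed_fields(old, new))  # {'price': ('$79.99', '$69.99')}
```

Appending one object per line means a failed batch can be resumed or audited line by line without loading the whole file as a single document.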

Related terms

  • HTML
  • API
  • Parser
  • Structured Data
  • Web Scraping