JSON | ScrapeRouter

JSON stands for JavaScript Object Notation. It’s a plain-text format for structured data, built from objects (key-value pairs), arrays, and scalar values: strings, numbers, booleans, and null. It’s what most scraping APIs return because programs can parse it directly, without the usual HTML cleanup mess.

Examples

A simple JSON response from a scraping API looks like this:

{
  "url": "https://example.com/product/123",
  "title": "Running Shoes",
  "price": "$79.99",
  "in_stock": true,
  "reviews": 214
}

In Python, you typically parse JSON into a normal dictionary:

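import requests

# A timeout keeps the call from hanging if the server never answers.
response = requests.get("https://api.example.com/data", timeout=10)
response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
data = response.json()       # parse the JSON body into a dict

print(data["title"])
print(data["price"])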
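
When the JSON is already in hand as text (a saved response, a file, a queue message), the standard-library json module does the same conversion without a network call. A minimal sketch using the sample response above; the variable names are only illustrative:

```python
import json

# Sample payload copied from the example response above.
raw = """
{
  "url": "https://example.com/product/123",
  "title": "Running Shoes",
  "price": "$79.99",
  "in_stock": true,
  "reviews": 214
}
"""

data = json.loads(raw)  # JSON object -> dict

# JSON types map onto Python types: string -> str, true/false -> bool,
# number -> int or float, null -> None, array -> list.
print(type(data).__name__)             # dict
print(data["title"])                   # Running Shoes
print(data["in_stock"] is True)        # True
print(type(data["reviews"]).__name__)  # int
print(type(data["price"]).__name__)    # str
```

Note that the price is still a string after parsing. JSON carries whatever the source put in the field, so any numeric cleanup is a separate step.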

If you’re scraping through ScrapeRouter, the request body is JSON too, and a structured JSON result is a lot easier to plug into a pipeline than raw HTML:

curl -X POST "https://www.scraperouter.com/api/v1/scrape/" \
  -H "Authorization: Api-Key $api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'
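
The same request can be assembled in Python. The sketch below only builds the request object using the standard library; the endpoint and header format are copied from the curl example, while build_scrape_request and the SCRAPEROUTER_API_KEY environment variable are hypothetical names chosen for illustration. The response shape is not shown here, so the sending step is left commented out.

```python
import json
import os
import urllib.request

# Endpoint and header format are copied from the curl example above.
API_URL = "https://www.scraperouter.com/api/v1/scrape/"


def build_scrape_request(target_url, api_key):
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# SCRAPEROUTER_API_KEY is a hypothetical environment variable name.
request = build_scrape_request(
    "https://example.com",
    os.environ.get("SCRAPEROUTER_API_KEY", "test-key"),
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it, uncomment the lines below and inspect the payload before
# relying on any specific keys.
# with urllib.request.urlopen(request, timeout=30) as resp:
#     payload = json.load(resp)
```

Serializing the body with json.dumps rather than string formatting avoids quoting bugs when the target URL contains characters that need escaping.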

Practical tips

JSON is great when the site already exposes structured data through an API, or when your scraper returns extracted fields instead of full HTML.
Don’t assume every field is present: production data is messy, keys disappear, values come back as null, and types drift.
Store raw responses so you can debug parsing issues later. When something silently changes upstream, you can check what you actually received instead of guessing.
Validate shape before trusting it, especially in batch jobs:

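def is_valid_product(data):
    """Return True if data is a dict and every required key is present and not None."""
    required = ["title", "price"]
    # A JSON body can also be a list or null, so check the container type first.
    return isinstance(data, dict) and all(data.get(key) is not None for key in required)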

Prefer JSON over hand-parsed HTML when you can. It’s less brittle, needs less cleanup, and leaves fewer weird edge cases for later.
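
The tips about missing keys, nulls, and drifting types can be sketched as a small normalization step. This is one possible approach, not a prescribed one: parse_price and normalize_product are hypothetical helpers, and the price parser assumes a dot as the decimal separator (as in "$79.99").

```python
from decimal import Decimal, InvalidOperation


def parse_price(value):
    """Return a Decimal for inputs like "$79.99" or 79.99, else None."""
    if value is None or isinstance(value, bool):
        return None
    # Keep digits, dot, and minus; drop currency symbols and commas.
    cleaned = "".join(ch for ch in str(value) if ch.isdigit() or ch in ".-")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def normalize_product(data):
    """Read fields defensively: .get() for missing keys, explicit type checks."""
    in_stock = data.get("in_stock")
    reviews = data.get("reviews")
    is_int = isinstance(reviews, int) and not isinstance(reviews, bool)
    return {
        "title": data.get("title"),
        "price": parse_price(data.get("price")),
        # None means "unknown", which is different from False.
        "in_stock": in_stock if isinstance(in_stock, bool) else None,
        "reviews": reviews if is_int else None,
    }


complete = {"title": "Running Shoes", "price": "$79.99", "in_stock": True, "reviews": 214}
# Simulated drift: price is null, in_stock is missing, reviews became a string.
drifted = {"title": "Running Shoes", "price": None, "reviews": "214"}

print(normalize_product(complete))
print(normalize_product(drifted))
```

Mapping unusable values to None rather than a default such as 0 or False keeps "missing" distinguishable from a real value, which matters once the records feed a comparison or an alert.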

Use cases

Scraping pipelines: pass extracted data between services without dragging HTML around.
API responses: most scraping APIs return JSON because it’s easy to consume in Python, JavaScript, Go, and basically everything else.
Storage: save records as JSON lines or documents for batch processing, retries, and audits.
Monitoring changes: compare JSON fields over time, like price, stock status, or ranking, without reparsing whole pages.
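
The storage and monitoring use cases fit together: write one record per line, read the lines back, and compare only the fields you care about. A minimal sketch under stated assumptions: write_jsonl, read_jsonl, and changed_fields are hypothetical helpers, the two snapshots are made-up sample data, and an in-memory buffer stands in for a real file.

```python
import io
import json


def write_jsonl(records, fh):
    """Write one compact JSON document per line (JSON Lines)."""
    for record in records:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(fh):
    """Yield one parsed record per non-empty line."""
    for line in fh:
        line = line.strip()
        if line:
            yield json.loads(line)


def changed_fields(old, new, fields):
    """Return {field: (old_value, new_value)} for watched fields that differ."""
    return {
        field: (old.get(field), new.get(field))
        for field in fields
        if old.get(field) != new.get(field)
    }


# Made-up snapshots of the same product on two different days.
yesterday = {"url": "https://example.com/product/123", "price": "$79.99", "in_stock": True}
today = {"url": "https://example.com/product/123", "price": "$69.99", "in_stock": True}

# io.StringIO stands in for a file opened with open(path, "a", encoding="utf-8").
buffer = io.StringIO()
write_jsonl([yesterday, today], buffer)
buffer.seek(0)

snapshots = list(read_jsonl(buffer))
print(changed_fields(snapshots[0], snapshots[1], ["price", "in_stock"]))
# prints {'price': ('$79.99', '$69.99')}
```

One record per line means a batch job can append without rewriting the file, and a corrupt line can be skipped or retried without losing the rest.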