Examples
A simple JSON response from a scraping API looks like this:
```json
{
  "url": "https://example.com/product/123",
  "title": "Running Shoes",
  "price": "$79.99",
  "in_stock": true,
  "reviews": 214
}
```
In Python, you typically parse JSON into a plain dictionary:

```python
import requests

response = requests.get("https://api.example.com/data")
response.raise_for_status()  # fail fast on HTTP errors instead of parsing an error page
data = response.json()       # parses the response body into a dict

print(data["title"])
print(data["price"])
```
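Fields like `price` often arrive as display strings rather than numbers. A minimal sketch of cleaning one up before doing math on it (`parse_price` is a hypothetical helper, not part of any API):

```python
import re

def parse_price(raw):
    """Pull a float out of a price string like '$79.99' (hypothetical helper)."""
    match = re.search(r"[\d.]+", raw)
    return float(match.group()) if match else None

print(parse_price("$79.99"))  # 79.99
print(parse_price("N/A"))     # None
```

Returning `None` for unparseable input instead of raising keeps batch jobs moving; you can filter the misses afterwards.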
If you're scraping through ScrapeRouter, structured JSON is a lot easier to plug into a pipeline than raw HTML:
```bash
curl -X POST "https://www.scraperouter.com/api/v1/scrape/" \
  -H "Authorization: Api-Key $api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'
```
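The same call from Python, sketched with `requests`; the endpoint and headers mirror the curl example above, while `build_scrape_request` and the API key value are placeholders for illustration:

```python
def build_scrape_request(url, api_key):
    """Assemble headers and payload for the scrape endpoint shown above."""
    headers = {
        "Authorization": f"Api-Key {api_key}",
        "Content-Type": "application/json",
    }
    payload = {"url": url}
    return headers, payload

headers, payload = build_scrape_request("https://example.com", "your-api-key")

# With the requests library installed, the actual call would look like:
# response = requests.post("https://www.scraperouter.com/api/v1/scrape/",
#                          headers=headers, json=payload)
# data = response.json()
```

Keeping request assembly in a small function makes it easy to unit-test the headers and payload without hitting the network.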
Practical tips
- JSON is great when the site already exposes structured data through an API, or when your scraper returns extracted fields instead of full HTML.
- Don’t assume every field is present: production data is messy, keys disappear, values come back as null, and types drift.
- Store raw responses when debugging parsing issues. It saves time when something silently changes.
- Validate shape before trusting it, especially in batch jobs:
```python
def is_valid_product(data):
    required = ["title", "price"]
    return all(key in data and data[key] is not None for key in required)
```
- Prefer JSON over hand-parsed HTML when you can. Less brittle, less cleanup, fewer weird edge cases later.
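The "don't assume every field is present" tip can be pushed one step further: normalize each record into a fixed shape before it enters the pipeline. A sketch of that idea (`normalize_product` and its defaults are hypothetical, keyed to the sample product JSON above):

```python
def normalize_product(data):
    """Defensive read: tolerate missing keys, null values, and drifted types."""
    return {
        "title": data.get("title") or "unknown",
        "price": str(data.get("price") or ""),
        "in_stock": bool(data.get("in_stock", False)),
        "reviews": int(data.get("reviews") or 0),
    }

# A record with most fields missing still comes out with every key present:
partial = normalize_product({"title": "Running Shoes"})
```

Downstream code can then rely on every key existing with a predictable type, instead of sprinkling `if key in data` checks everywhere.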
Use cases
- Scraping pipelines: pass extracted data between services without dragging HTML around.
- API responses: most scraping APIs return JSON because it’s easy to consume in Python, JavaScript, Go, and basically everything else.
- Storage: save records as JSON lines or documents for batch processing, retries, and audits.
- Monitoring changes: compare JSON fields over time, like price, stock status, or ranking, without reparsing whole pages.
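The storage use case above maps directly onto JSON Lines: one object per line, so you can append new records and stream large files without loading everything into memory. A minimal sketch using only the standard library (`products.jsonl` is an example filename):

```python
import json

records = [
    {"url": "https://example.com/product/123", "price": "$79.99"},
    {"url": "https://example.com/product/456", "price": "$49.99"},
]

# Write one JSON object per line (JSON Lines): appends are cheap and safe.
with open("products.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Read it back line by line, which scales to files too big for memory.
with open("products.jsonl") as f:
    loaded = [json.loads(line) for line in f]
```

Because each line is independent, a single corrupted record can be skipped or retried without invalidating the whole file.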