## Examples
A common workflow is: open the page, open DevTools, reload, then filter for XHR or Fetch requests.
You might see a request like this:
```bash
curl 'https://example.com/api/products?page=2' \
  -H 'accept: application/json' \
  -H 'user-agent: Mozilla/5.0' \
  -H 'cookie: session=abc123'
```
And the response is the thing you actually want:
```json
{
  "products": [
    {"id": 101, "name": "Widget", "price": 19.99},
    {"id": 102, "name": "Cable", "price": 4.99}
  ],
  "next_page": 3
}
```
At that point, scraping the rendered DOM is often pointless. You can just request the same endpoint directly:
```python
import requests

url = "https://example.com/api/products?page=2"
headers = {
    "accept": "application/json",
    "user-agent": "Mozilla/5.0",
}

r = requests.get(url, headers=headers, timeout=30)
print(r.json())
```
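Since the captured response includes a `next_page` field, the direct request generalizes to a pagination loop. A sketch, assuming the API signals the last page by omitting `next_page` or setting it to null (worth confirming against a few real responses before relying on it):

```python
def fetch_all_products(get_page):
    """Collect products across pages by following the next_page field.

    get_page(page_number) must return a parsed JSON dict shaped like the
    response captured above: {"products": [...], "next_page": int or null}.
    """
    products, page = [], 1
    while page is not None:
        data = get_page(page)
        products.extend(data.get("products", []))
        page = data.get("next_page")  # None / missing means we're done
    return products


def get_page_http(page):
    # Same request as the snippet above, parameterized by page number.
    import requests

    r = requests.get(
        "https://example.com/api/products",
        headers={"accept": "application/json", "user-agent": "Mozilla/5.0"},
        params={"page": page},
        timeout=30,
    )
    r.raise_for_status()
    return r.json()


# products = fetch_all_products(get_page_http)
```

Separating the loop from the HTTP call keeps the traversal logic testable without hitting the network, and makes it easy to swap in a session, proxy, or router-backed fetcher later.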
If the site is protected, the network tab still tells you what the browser is trying to call. That helps you decide whether a direct request is enough or whether you need a browser, session handling, proxies, or a router layer like ScrapeRouter.
## Practical tips
- Filter by Fetch/XHR first: this cuts out fonts, images, and other junk.
- Reload the page with DevTools open: many useful requests only appear during initial page load.
- Look at headers, query params, cookies, and payloads: the URL alone is often not enough.
- Check the response tab before building a scraper: if the data is already in JSON, don't waste time parsing HTML.
- Watch for pagination and cursors: page number, offset, cursor token, next URL.
- Compare repeated actions: click "next", apply a filter, open a product, then see what request changed.
- Be careful with one-off success: a request that works once in your browser may depend on session cookies, CSRF tokens, or fingerprinting.
- If direct replay keeps failing in production, that's the point where a simple requests script stops being cheap.
- For protected targets, send the page through ScrapeRouter instead of rebuilding anti-bot handling yourself:

  ```bash
  curl 'https://www.scraperouter.com/api/v1/scrape/' \
    -H "Authorization: Api-Key $api_key" \
    -H 'Content-Type: application/json' \
    -d '{
      "url": "https://example.com/products"
    }'
  ```
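The same call works from Python. A sketch, assuming the endpoint, `Api-Key` header scheme, and JSON body shown in the curl command above (check the ScrapeRouter docs for the full parameter list):

```python
API_URL = "https://www.scraperouter.com/api/v1/scrape/"


def scraperouter_headers(api_key):
    # Mirrors the -H flags in the curl command above.
    return {
        "Authorization": f"Api-Key {api_key}",
        "Content-Type": "application/json",
    }


def scrape(url, api_key):
    import requests  # same library as the earlier snippet

    r = requests.post(
        API_URL,
        headers=scraperouter_headers(api_key),
        json={"url": url},
        timeout=60,
    )
    r.raise_for_status()
    return r.text


# html = scrape("https://example.com/products", api_key="your-api-key")
```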
## Use cases
- Finding hidden JSON endpoints behind a JavaScript-heavy page.
- Figuring out whether a site needs browser automation at all, or just a direct API-style request.
- Reverse-engineering search, pagination, filters, and lazy loading.
- Debugging why scraped HTML does not match what you see in the browser: the page may hydrate from a separate request after load.
- Checking what authentication state matters: cookies, bearer token, CSRF token, request headers.
- Reducing cost and fragility: if the network tab shows a clean data endpoint, you can often skip full browser rendering.
- Understanding where simple scraping stops working in production: some requests are easy to copy locally and annoying to keep alive at scale.
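One way to act on the last two points: before committing to browser automation, replay the candidate endpoint and check whether the response is data or an HTML shell. A rough heuristic (the function name here is illustrative, not a real library API):

```python
import json


def looks_like_data_endpoint(content_type, body):
    """Return True if a captured response carries data directly.

    content_type and body come from a replayed request, e.g.
    r.headers["content-type"] and r.text with requests. JSON means a
    plain HTTP script is probably enough; an HTML shell hints the page
    hydrates from other requests after load.
    """
    if "application/json" in content_type.lower():
        return True
    try:  # some APIs serve JSON under a generic content type
        json.loads(body)
        return True
    except (ValueError, TypeError):
        return False
```

If this returns False for every request on the page, that is a signal the data arrives some other way, and the network tab is where you go looking for it.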