Examples
A HAR file is mostly useful during debugging and reverse engineering. You open a page in Chrome DevTools, export the network log, and inspect the requests that returned the data you actually care about.
```json
{
  "log": {
    "entries": [
      {
        "request": {
          "method": "GET",
          "url": "https://example.com/api/products?page=1"
        },
        "response": {
          "status": 200,
          "content": {
            "mimeType": "application/json"
          }
        },
        "time": 184.32
      }
    ]
  }
}
```
Once you spot the real data endpoint in the HAR, you can often replay it directly instead of scraping rendered HTML.
```shell
curl 'https://example.com/api/products?page=1' \
  -H 'accept: application/json' \
  -H 'x-requested-with: XMLHttpRequest' \
  -H 'cookie: session=abc123'
```
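Before replaying anything, it helps to pull the candidate endpoints out of the export programmatically. A minimal sketch (the `json_endpoints` helper is ours, not a standard API; field names follow the HAR 1.2 structure shown above):

```python
import json

def json_endpoints(har: dict) -> list[tuple[str, str, int]]:
    """Return (method, url, status) for entries whose response looks like JSON."""
    hits = []
    for entry in har["log"]["entries"]:
        # mimeType lives under response.content in HAR 1.2
        mime = entry["response"].get("content", {}).get("mimeType", "")
        if "json" in mime:
            req = entry["request"]
            hits.append((req["method"], req["url"], entry["response"]["status"]))
    return hits

# The entry from the example above:
har = {"log": {"entries": [{
    "request": {"method": "GET", "url": "https://example.com/api/products?page=1"},
    "response": {"status": 200, "content": {"mimeType": "application/json"}},
    "time": 184.32,
}]}}
print(json_endpoints(har))
```

On a real export you would load the file first, e.g. `json_endpoints(json.load(open("capture.har", encoding="utf-8")))` (filename hypothetical).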
Practical tips
- Filter noise fast: look for `fetch` and XHR requests first, then GraphQL, then JSON responses. Ignore fonts, images, and analytics unless the site is doing something weird.
- Pay attention to sequence: some requests only work after a config call, CSRF token fetch, or session cookie is set. The HAR shows that order, which matters more than people think.
- Check headers and payloads: auth tokens, cursor params, locale settings, and client hints are often the difference between `200` and `403`.
- Watch for short-lived values: HAR files can include temporary cookies, bearer tokens, and signed URLs. Good for debugging, bad if you treat them like permanent inputs.
- Don’t overfit to one capture: one HAR from one browser session is a clue, not the full system. Repeat the flow a few times and compare what changes.
- Be careful with secrets: HAR exports can contain cookies, auth headers, and personal data. Don’t paste them into tickets or ship them around Slack like it’s nothing.
- Use HAR to reduce browser usage: if the page data comes from a clean JSON endpoint, you may not need full browser automation in production at all. That saves money and removes a lot of failure modes.
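Because exports leak secrets so easily, it is worth scrubbing a HAR before it goes anywhere near a ticket. A rough sketch of what that can look like (the `redact_har` helper and its header list are assumptions, not part of any standard tooling):

```python
SECRET_HEADERS = ("authorization", "cookie", "set-cookie")

def redact_har(har: dict) -> dict:
    """Blank out common secret-bearing headers and cookie values in place."""
    for entry in har["log"]["entries"]:
        for part in (entry.get("request", {}), entry.get("response", {})):
            for header in part.get("headers", []):
                if header.get("name", "").lower() in SECRET_HEADERS:
                    header["value"] = "REDACTED"
            for cookie in part.get("cookies", []):
                cookie["value"] = "REDACTED"
    return har

har = {"log": {"entries": [{
    "request": {
        "headers": [{"name": "Cookie", "value": "session=abc123"}],
        "cookies": [{"name": "session", "value": "abc123"}],
    },
    "response": {"headers": [], "cookies": []},
}]}}
redact_har(har)
```

This only covers the obvious places; tokens can also hide in URLs, POST bodies, and response content, so treat it as a first pass, not a guarantee.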
Use cases
- Finding hidden APIs: a page looks heavily rendered, but the actual data comes from `/api/search` or a GraphQL POST behind the scenes.
- Reproducing browser requests: you need to copy the exact headers, cookies, and payload shape that made the request work.
- Debugging blocked scrapers: the browser succeeds, your script gets blocked, and the HAR helps you compare what is missing.
- Understanding pagination: page numbers are often fake; the HAR shows the real cursor, offset, or continuation token.
- Reducing maintenance: instead of parsing unstable HTML, you move to the underlying JSON call the frontend already depends on.
- Validating render necessity: if the HAR shows the data is fetched client-side after load, you can decide whether you need JavaScript rendering or just the backend request.
- Troubleshooting scraping infrastructure: when using a scraper API or router layer, HAR-style inspection helps separate target-site issues from proxy, header, or session problems.
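For the "browser succeeds, script gets blocked" case, the useful move is a straight diff between the headers in the HAR entry and the headers your script actually sends. A small sketch (the `header_diff` helper is hypothetical; the HAR side uses the standard `request.headers` name/value list):

```python
def header_diff(har_entry: dict, sent: dict) -> dict:
    """Compare headers the browser sent (from a HAR entry) with a script's headers."""
    browser = {h["name"].lower(): h["value"] for h in har_entry["request"]["headers"]}
    script = {k.lower(): v for k, v in sent.items()}
    return {
        "missing": sorted(set(browser) - set(script)),      # browser sent, script didn't
        "extra": sorted(set(script) - set(browser)),        # script sent, browser didn't
        "different": sorted(k for k in browser.keys() & script.keys()
                            if browser[k] != script[k]),
    }

entry = {"request": {"headers": [
    {"name": "Accept", "value": "application/json"},
    {"name": "X-Requested-With", "value": "XMLHttpRequest"},
]}}
print(header_diff(entry, {"Accept": "*/*"}))
```

The `missing` and `different` buckets are usually where the block lives; start there before blaming the proxy layer.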