Examples
A simple REST API usually maps cleanly to HTTP methods and URLs:
```shell
curl -X GET "https://api.example.com/products/123"
```

Returns:

```json
{
  "id": 123,
  "name": "Running Shoes",
  "price": 79.99
}
```
Creating a resource:
```shell
curl -X POST "https://api.example.com/products" \
  -H "Content-Type: application/json" \
  -d '{"name":"Running Shoes","price":79.99}'
```
Typical pattern:
- GET /products/123: fetch one resource
- GET /products: list resources
- POST /products: create a resource
- PUT /products/123: replace or update a resource
- DELETE /products/123: remove a resource
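The same pattern can be sketched in Python with `requests`, matching the example calls above. The base URL and the `/products` resource are the hypothetical examples used throughout this page, not a real API:

```python
import requests

# Hypothetical base URL from the examples above.
BASE = "https://api.example.com"

def product_url(product_id=None):
    """Build the collection URL, or an item URL when an id is given."""
    if product_id is None:
        return f"{BASE}/products"
    return f"{BASE}/products/{product_id}"

def get_product(product_id):
    return requests.get(product_url(product_id), timeout=30)

def create_product(payload):
    return requests.post(product_url(), json=payload, timeout=30)

def replace_product(product_id, payload):
    return requests.put(product_url(product_id), json=payload, timeout=30)

def delete_product(product_id):
    return requests.delete(product_url(product_id), timeout=30)
```

Each function maps one HTTP verb to one URL shape, which is exactly what makes a well-behaved REST API predictable to script against.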
In scraping, this matters because some sites expose internal REST endpoints behind the frontend. If you can call those directly, it is often cleaner and cheaper than parsing HTML, until auth, rate limits, or anti-bot controls get in the way.
Practical tips
- Don't confuse REST-ish with truly well-designed REST. A lot of APIs use JSON over HTTP and call it REST; that's fine. What matters is whether the API is predictable to work with.
- For scraping and automation, REST endpoints are often easier to monitor than browser flows: stable URLs, clearer payloads, simpler retries.
- Production reality: internal REST APIs used by websites change without notice. They are usually less noisy than HTML, but they are not a contract unless the provider documents them publicly.
- Watch the basics first: status codes, pagination, auth tokens, rate limits, cache headers.
- If an API starts returning 401, 403, or empty 200 responses, that usually means auth or anti-bot logic changed, not that your parser broke.
- ScrapeRouter fits in when the "just call the endpoint" approach stops being reliable: session handling, proxying, browser escalation, and routing around providers that work for one target but not another.
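One of the tips above is worth encoding directly: 401, 403, and empty-200 responses usually point at auth or anti-bot changes, not a broken parser. A minimal sketch of that triage logic (the function name and labels are illustrative, not any library's API):

```python
def classify_failure(status_code, body):
    """Rough triage for a failed or suspicious response."""
    if status_code in (401, 403):
        return "auth-or-antibot"        # credentials or bot detection changed
    if status_code == 200 and not body:
        return "auth-or-antibot"        # empty 200 is often a soft block
    if status_code == 429:
        return "rate-limited"           # back off before retrying
    if 500 <= status_code < 600:
        return "server-error"           # retry with backoff is reasonable
    return "ok"
```

Logging this classification alongside each failure makes it much easier to tell "the site changed its markup" apart from "the site changed its defenses".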
Quick example of a normal REST fetch in Python:
```python
import requests

resp = requests.get(
    "https://api.example.com/products",
    params={"page": 1, "limit": 50},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()
print(data)
```
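Extending that fetch to paginate is usually a small loop. A hedged sketch, assuming `page`/`limit` query parameters and a list-shaped JSON response — real APIs vary (cursors, `next` links, wrapped envelopes), so check what the endpoint actually returns:

```python
import requests

def has_more(batch, limit):
    """A full page usually means another page exists; a short page ends the loop."""
    return len(batch) == limit

def fetch_all_products(url="https://api.example.com/products", limit=50):
    """Collect every page from a hypothetical paginated list endpoint."""
    items, page = [], 1
    while True:
        resp = requests.get(url, params={"page": page, "limit": limit}, timeout=30)
        resp.raise_for_status()
        batch = resp.json()
        items.extend(batch)
        if not has_more(batch, limit):
            return items
        page += 1
```

The stop condition is the part that breaks most often in practice: a "short page" heuristic works for simple APIs, but cursor-based endpoints need you to follow whatever continuation token the response provides.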
Use cases
- Public APIs: clean integration when the provider actually wants you to consume data that way.
- Internal website APIs: many modern apps load page data from REST endpoints in the background, which can be easier to extract than rendered HTML.
- Scraping pipelines: faster and cheaper data collection when you can avoid full browser rendering.
- Operational monitoring: easier to debug than browser automation because requests, headers, payloads, and response codes are more obvious.
- Hybrid scraping setups: use REST when it works, fall back to browser-based collection when auth, JavaScript flows, or anti-bot systems make direct calls too fragile.
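The hybrid setup in the last bullet reduces to a small routing decision: prefer the cheap REST call, escalate to a browser-based collector when it fails or comes back empty. A sketch with both fetchers injected as callables so the routing stays testable — this is illustrative structure, not a real ScrapeRouter API:

```python
def fetch_with_fallback(rest_fetch, browser_fetch):
    """Return (source, data): try the direct endpoint, escalate on failure.

    rest_fetch and browser_fetch are zero-argument callables supplied by
    the caller; an exception or an empty payload from the REST path
    triggers the browser fallback.
    """
    try:
        data = rest_fetch()
        if data:                      # empty payloads count as failure too
            return "rest", data
    except Exception:
        pass                          # blocked, timed out, or auth changed
    return "browser", browser_fetch()
```

Keeping the two collection paths behind the same interface means the rest of the pipeline doesn't care which one produced the data, which is what makes routing between them cheap.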