Scrape
Reference for the /api/v1/scrape/ request and response schema.
The scrape endpoint submits scraping jobs to ScrapeRouter. Each request targets a URL and uses automatic route selection by default, or a specific scraper and optional proxy configuration when you provide them. This page covers the request and response schemas and how to create a synchronous scrape request.
Request Schema
Attributes of the scrape request object (ScrapeRequestSchema).
| Attribute | Type | Description | Default |
|---|---|---|---|
| url (required) | string (URL) | Absolute http/https URL to scrape. | — |
| method | string | HTTP method used for the target request. | GET |
| headers | object | Headers sent to the target URL. | — |
| query | array or object | Query parameters appended to the target URL. | — |
| data | any | Request body; body type is auto-detected (JSON, form, or raw). | — |
| cookies | array or object | Cookies for the target request. | — |
| timeout_ms | integer or number | Target request timeout in milliseconds. | — |
| allow_redirects | boolean | Whether the target request should follow redirects. | — |
| browser_type | string | Browser engine for browser-capable scrapers when supported. | — |
| headless | boolean | Run browser-capable scrapers in headless mode when supported. | — |
| wait_for_selector | string | CSS selector to wait for when supported. | — |
| wait_for_timeout_ms | integer or number | Additional wait timeout in milliseconds when supported. | — |
| wait_for_load_state | string | Browser load state to wait for when supported. | — |
| page_actions | array | Browser actions such as click, scroll, wait, or evaluate when supported. | — |
| screenshot | boolean | Request screenshot capture when the selected scraper supports it. | — |
| scraper | string | Scraper identifier, or auto for route selection. | auto |
| scraper_options | object | Advanced scraper-native options; units and names are scraper-specific. | {} |
| proxy | any | Proxy config object or proxy type string: datacenter, residential, mobile. | datacenter |
| scraperouter | any | Optional client metadata reserved for ScrapeRouter routing and diagnostics. | — |
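As a rough illustration of the request schema, the sketch below assembles a JSON-ready payload on the client side. The helper name `build_scrape_request` and the `OPTIONAL_FIELDS` set are hypothetical and not part of any ScrapeRouter SDK; the attribute names are copied from the table above, and the checks are only a local approximation of whatever validation the API performs.

```python
from urllib.parse import urlsplit

# Attribute names copied from the request schema table.
OPTIONAL_FIELDS = {
    "method", "headers", "query", "data", "cookies", "timeout_ms",
    "allow_redirects", "browser_type", "headless", "wait_for_selector",
    "wait_for_timeout_ms", "wait_for_load_state", "page_actions",
    "screenshot", "scraper", "scraper_options", "proxy", "scraperouter",
}


def build_scrape_request(url: str, **options) -> dict:
    """Return a JSON-ready payload (hypothetical helper, not an official API)."""
    parts = urlsplit(url)
    # The schema requires an absolute http/https URL.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"url must be an absolute http/https URL: {url!r}")

    # Reject attribute names that do not appear in the schema table.
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown request attributes: {sorted(unknown)}")

    # Leave out unset options so the documented defaults apply.
    payload = {"url": url}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

Calling `build_scrape_request("https://example.com", timeout_ms=15000)` yields a dictionary that can be passed as the `json=` argument of an HTTP client.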
Response Schema
Attributes of the scrape response object (ScrapeResponseSchema).
Optional fields are omitted when no value is available.
| Attribute | Type | Description | Default |
|---|---|---|---|
| id | uuid | Unique identifier of the scrape (Scrape.id). | — |
| status_code | integer | Target response status code or scraper status. Inspect this field and errors even when HTTP is 200. | — |
| final_url | string | Final target URL after redirects, when available. | — |
| headers | object | Target response headers. | {} |
| content | string | Target response body as text or base64. | — |
| content_encoding | string | Encoding of content: text for decoded text, base64 for binary bodies. | — |
| cookies | array or object | Cookies returned by the target response, when available. | — |
| errors | array | Scraper or target-level errors for this attempt. | — |
| screenshot_url | string | First saved screenshot artifact URL, when available. | — |
| scraper_data | object | Scraper-specific response data, when available. | — |
| scraperouter | any | ScrapeRouter routing metadata such as selected scraper, proxy type, and request cost. | — |
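Because content can arrive either as text or as base64, a client usually needs a small decoding step. The following is a minimal sketch, assuming the response JSON has already been parsed into a dictionary; `decode_content` is a hypothetical helper name, and treating a missing content_encoding as text is an assumption rather than documented behavior.

```python
import base64
from typing import Optional, Union


def decode_content(result: dict) -> Optional[Union[str, bytes]]:
    """Return text bodies as str and base64 bodies as raw bytes."""
    content = result.get("content")  # optional fields may be omitted
    if content is None:
        return None
    if result.get("content_encoding") == "base64":
        return base64.b64decode(content)
    # Assumption: a missing content_encoding is handled like "text".
    return content
```

Using `.get()` instead of direct indexing keeps the helper safe when optional fields are absent from the response.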
Create a scrape request
POST /api/v1/scrape/
Creates a new scraping request and returns the result synchronously. HTTP 200 means the API request completed; check JSON status_code and errors for the target scrape result.
Required attributes
| Parameter | Description |
|---|---|
| url | The URL to scrape |
Optional attributes
| Parameter | Description |
|---|---|
| method | HTTP method. Default: "GET" |
| headers | Custom request headers |
| proxy | Proxy type or config. Default: "datacenter" |
| scraper | Scraper identifier to force, or "auto". Default: "auto" |
| timeout_ms | Target request timeout in milliseconds |
| screenshot | Requests screenshot capture from supported browser-capable scrapers. When an artifact is produced, the response includes screenshot_url; otherwise the field is omitted. |
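Since screenshot_url is omitted rather than set to null when no artifact exists, client code should test for the key's presence. The sketch below shows one way to do that; `screenshot_outcome` and its three return labels are hypothetical names chosen for illustration.

```python
def screenshot_outcome(payload: dict, result: dict) -> str:
    """Describe what happened to a screenshot request (hypothetical helper)."""
    if not payload.get("screenshot"):
        return "not_requested"
    # The field is absent when the selected scraper produced no artifact.
    return "captured" if result.get("screenshot_url") else "unavailable"
```

An "unavailable" outcome does not by itself indicate a failed scrape; it only means no screenshot artifact was returned for that attempt.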
Result status
ScrapeRouter separates API transport status from the target scrape result. Validation, authentication, credit, concurrency, and platform timeouts use HTTP 4xx/5xx. A completed scrape attempt returns HTTP 200, even when the target result is a block, timeout, or scraper error. Treat JSON status_code in the 200-399 range with no errors as a successful target scrape.
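The two-level status rule above can be sketched as a pair of small functions. The function names and the three labels are hypothetical; the only logic taken from this page is that non-200 HTTP responses are API-level failures and that a status_code in the 200-399 range with no errors counts as a successful target scrape.

```python
from typing import Optional


def is_successful_scrape(result: dict) -> bool:
    """True when status_code is in 200-399 and no errors are reported."""
    status = result.get("status_code")
    errors = result.get("errors") or []  # optional field; may be omitted
    return isinstance(status, int) and 200 <= status <= 399 and not errors


def classify(http_status: int, result: Optional[dict]) -> str:
    """Map one API call to 'api_error', 'target_ok', or 'target_failed'."""
    # Validation, auth, credit, concurrency, and platform timeouts: HTTP 4xx/5xx.
    if http_status != 200 or result is None:
        return "api_error"
    return "target_ok" if is_successful_scrape(result) else "target_failed"
```

With an HTTP client such as requests, this would typically be called as `classify(response.status_code, response.json())` once the body is known to be JSON.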
Advanced options
Prefer top-level normalized fields such as timeout_ms, wait_for_timeout_ms, and screenshot. Values inside scraper_options are scraper-native overrides; their names and units are defined by the selected scraper. For Playwright, for example, the native scraper_options.timeout value is in milliseconds.
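To make the distinction concrete, the sketch below contrasts a payload that uses only normalized top-level fields with one that adds a scraper-native override. The scraper identifier "playwright" is a placeholder assumption, since this page does not list valid identifiers, and the numeric values are arbitrary.

```python
import json

# Preferred: normalized top-level fields with documented units (milliseconds).
preferred = {
    "url": "https://example.com",
    "timeout_ms": 30000,
    "wait_for_timeout_ms": 2000,
    "screenshot": True,
}

# Native override: names and units belong to the selected scraper.
native_override = {
    "url": "https://example.com",
    "scraper": "playwright",  # placeholder identifier, not confirmed by the docs
    "scraper_options": {"timeout": 45000},  # Playwright-native, milliseconds
}

for payload in (preferred, native_override):
    print(json.dumps(payload, indent=2))
```

Keeping native options isolated under scraper_options makes it easier to switch scrapers later, because only that sub-object needs to be revisited.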
Request
#!/usr/bin/env bash
# Submit a synchronous scrape request with curl.
curl -X POST https://www.scraperouter.com/api/v1/scrape/ \
  -H "Authorization: Api-Key {your_api_key}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "scraper": "auto",
    "proxy": "datacenter"
  }'

import requests

# json= serializes the payload and sets the Content-Type header.
response = requests.post(
    "https://www.scraperouter.com/api/v1/scrape/",
    headers={"Authorization": "Api-Key {your_api_key}"},
    json={
        "url": "https://example.com",
        "scraper": "auto",
        "proxy": "datacenter",
    },
)
# HTTP 200 only means the API call completed; inspect status_code and errors.
data = response.json()

// Submit the same request with fetch.
const response = await fetch("https://www.scraperouter.com/api/v1/scrape/", {
  method: "POST",
  headers: {
    "Authorization": "Api-Key {your_api_key}",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://example.com",
    scraper: "auto",
    proxy: "datacenter",
  }),
});
const data = await response.json();
Response
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status_code": 200,
  "final_url": "https://example.com",
  "content": "<!doctype html>...",
  "content_encoding": "text",
  "headers": {
    "content-type": "text/html; charset=UTF-8"
  },
  "scraperouter": {
    "scraper": "apiritif/curl-cffi:0.14",
    "request_cost": null,
    "proxy_type": "datacenter"
  }
}