Examples
A simple example with Python requests using a session, which reuses connections under the hood:
```python
import requests

session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0"})

for path in ["/", "/products", "/pricing"]:
    r = session.get(f"https://example.com{path}", timeout=30)
    print(path, r.status_code, len(r.text))
```
With curl, HTTP/1.1 connections are persistent by default, so the explicit header is mostly redundant; curl reuses connections automatically when the server supports it, though that only matters when a single invocation fetches more than one URL:

```shell
curl --http1.1 -H "Connection: keep-alive" https://example.com/
```
In practice, the useful part is not the header itself. The useful part is reusing a client or session instead of creating a brand new connection for every request.
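To make that reuse visible, here is a self-contained sketch: it starts a local HTTP/1.1 server that counts accepted TCP connections, then sends five requests through one Session. The server, port, and counting logic are purely illustrative, not part of any real setup.

```python
import http.server
import socketserver
import threading

import requests  # third-party; everything else here is stdlib


class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # required for keep-alive responses

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # Content-Length lets the client know the response is complete,
        # so the connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


class CountingServer(socketserver.ThreadingMixIn, http.server.HTTPServer):
    """Counts accepted TCP connections so reuse is visible."""
    daemon_threads = True
    connection_count = 0

    def get_request(self):
        self.connection_count += 1
        return super().get_request()


server = CountingServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

session = requests.Session()
for _ in range(5):
    session.get(f"http://127.0.0.1:{port}/", timeout=5)

print(server.connection_count)  # typically 1: five requests ride one persistent connection
server.shutdown()
```

If you swap the Session for a bare requests.get() in the loop, the count jumps to one connection per request, which is exactly the setup cost keep-alive is meant to avoid.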
Practical tips
- Use sessions or connection pooling: in Python that usually means requests.Session(); in Node it means using an HTTP agent with keep-alive enabled.
- It helps most when you hit the same domain repeatedly: detail pages, pagination, API endpoints, asset fetches.
- Don’t expect miracles: keep-alive reduces connection setup overhead, but it does not fix bad proxies, rate limits, TLS fingerprint issues, or browser-level blocking.
- Watch idle timeouts: servers, proxies, and load balancers often close idle connections after a short period, so reused connections can still die underneath you.
- Be careful with flaky proxy networks: some proxies claim to support persistent connections and then silently drop them, which turns into random retries and weird failures.
- At browser scale, this matters too: if you run Playwright or Puppeteer against the same origin a lot, persistent connections can reduce wasted handshakes and improve throughput.
- With ScrapeRouter: this is the kind of low-level plumbing you usually don’t want to babysit yourself. In production, the hard part isn’t knowing what keep-alive is; it’s figuring out which upstreams actually honor it and stay stable under load.
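Because reused connections can die underneath you (idle timeouts, flaky proxies), a pooled session is commonly paired with a retry policy. A minimal sketch using requests' HTTPAdapter and urllib3's Retry; the retry counts, backoff factor, and pool sizes below are placeholder values to tune for your workload:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient failures, including connections dropped while idle.
retry = Retry(
    total=3,                          # placeholder: total retry budget
    backoff_factor=0.5,               # exponential backoff between attempts
    status_forcelist=[502, 503, 504], # also retry common upstream errors
    allowed_methods=["GET", "HEAD"],  # only retry idempotent requests
)

# The adapter owns the connection pool; mount it for both schemes.
adapter = HTTPAdapter(max_retries=retry, pool_connections=10, pool_maxsize=10)

session = requests.Session()
session.mount("https://", adapter)
session.mount("http://", adapter)
# session.get(...) now retries with backoff if a pooled connection has died.
```

Restricting retries to idempotent methods matters: blindly retrying a POST after a mid-flight connection drop can duplicate side effects.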
Use cases
- Crawling lots of pages on the same domain without paying the TCP and TLS setup cost every time.
- Pulling paginated API data where you make hundreds or thousands of requests to one host.
- Browser automation workloads that repeatedly hit the same backend and benefit from connection reuse.
- Internal scraper services where reducing connection churn lowers latency and infrastructure waste.
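The paginated-API case above reduces to one Session, one host, many requests. A sketch of that pattern; the page shape ({"items": [...], "next": ...}) and the fetch_all_pages helper are hypothetical, just to show the reuse:

```python
import requests


def fetch_all_pages(session: requests.Session, base_url: str) -> list:
    """Drain a hypothetical paginated JSON API over one reused connection.

    Assumes each page looks like {"items": [...], "next": "/path" or null};
    both the payload shape and the endpoint path are illustrative.
    """
    items, path = [], "/page/1"
    while path:
        r = session.get(base_url + path, timeout=30)
        r.raise_for_status()
        data = r.json()
        items.extend(data["items"])
        path = data.get("next")  # null/None ends the loop
    return items


# Usage: every page rides the same pooled connection.
# session = requests.Session()
# all_items = fetch_all_pages(session, "https://api.example.com")
```

At hundreds or thousands of pages, the saved TCP and TLS handshakes add up to real wall-clock time, which is the whole point of the section above.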
This is one of those things that sounds small until you run it at volume. In a single script you barely notice it; on a real crawl, opening a fresh connection for every request is just unnecessary drag.