Glossary

Session

A session is the state a site keeps across multiple requests so it can treat them as part of the same user flow. In scraping, that usually means cookies, auth state, cart state, CSRF tokens, and anything else that has to persist between requests. If it does not persist, things work for one request and then quietly break on the next.

Examples

A basic example is logging in once, then reusing the same cookies for the pages that come after it.

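import requests

# One Session object keeps cookies across every request made through it.
s = requests.Session()

# Log in once; any cookies the site sets are stored on the session.
login = s.post(
    "https://example.com/login",
    data={"email": "user@example.com", "password": "secret"},
    timeout=10,
)
login.raise_for_status()  # stop here on a 4xx/5xx response

# The stored cookies are sent automatically with this request.
page = s.get("https://example.com/account/orders", timeout=10)
print(page.status_code)
print(page.text[:200])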

Without a session, the second request may look like a brand new visitor and get redirected back to login.
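One way to notice a lost session is to check where a request actually ended up. The following is a minimal sketch, assuming the login page lives at `/login` as in the example above and that `resp` behaves like a `requests.Response`; the helper name `bounced_to_login` is hypothetical.

```python
from urllib.parse import urlparse


def bounced_to_login(resp, login_path="/login"):
    """Return True if the response ended up on the login page.

    `resp` is expected to look like a requests.Response, where `resp.url`
    is the final URL after redirects. `login_path` is an assumption about
    the target site and will differ from site to site.
    """
    final_path = urlparse(resp.url).path
    return final_path.rstrip("/") == login_path.rstrip("/")
```

Calling `bounced_to_login(page)` after the `s.get(...)` above is a cheap signal that the session was dropped and the login step needs to run again.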

Another common case is a site setting a CSRF cookie on one page, then expecting it on the next form submit. If you drop the session, that flow fails.

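# Fetch the login page and save any cookies it sets, including a CSRF cookie.
curl -s -c cookies.txt -o /dev/null https://example.com/login
# Send those cookies back with the form submit and store any new ones.
curl -b cookies.txt -c cookies.txt -X POST https://example.com/login \
  -d "email=user@example.com&password=secret"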
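Many sites also embed the CSRF token in the form itself as a hidden input, so the cookie alone is not enough. The sketch below shows one way to handle that in Python: it copies every hidden field from the form page into the submit, so it does not need to know the token's field name. It assumes `session` is a `requests.Session` (or anything with the same `get`/`post` methods); the helper names are hypothetical.

```python
from html.parser import HTMLParser


class _HiddenInputs(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return the hidden form fields found in an HTML page as a dict."""
    parser = _HiddenInputs()
    parser.feed(html)
    return parser.fields


def login_with_csrf(session, url, credentials):
    """Fetch the form, then submit it with the hidden fields included.

    Using the same session for both calls means cookies set by the GET
    are sent back with the POST.
    """
    form_page = session.get(url, timeout=10)
    payload = {**hidden_fields(form_page.text), **credentials}
    return session.post(url, data=payload, timeout=10)
```

A call might look like `login_with_csrf(requests.Session(), "https://example.com/login", {"email": "user@example.com", "password": "secret"})`. Sites that deliver the token through JavaScript or a response header need a different extraction step.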

In production scraping, session continuity also matters for things like paginated search flows, location selection, and anti-bot checks that expect a believable sequence of requests.

Practical tips

  • Persist cookies across related requests: login, search, add-to-cart, checkout previews, account pages
  • Keep session-bound headers consistent: user-agent, accept-language, sec-ch headers; changing these mid-flow is an easy way to look fake
  • Watch for hidden state: CSRF tokens, signed request params, local storage backed tokens, region selection
  • Do not reuse one session forever: sessions expire, get invalidated, or accumulate bad state
  • Separate sessions by account or workflow: mixing them creates weird bugs that are hard to debug
  • Expect session pinning: some sites tie a session to IP, fingerprint, or both; if the IP changes between requests, the session may die
  • Log enough to debug: response code, redirect chain, Set-Cookie headers, final URL, whether the session was reused
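Two of the tips above, not reusing one session forever and logging enough to debug, can be sketched together. This is an illustration under assumptions, not a definitive implementation: `RotatingSession` and `summarize` are hypothetical names, the age and request limits are arbitrary, and `factory` is any zero-argument callable that builds a session, such as `requests.Session`.

```python
import time


class RotatingSession:
    """Hand out one session until it is too old or too heavily used."""

    def __init__(self, factory, max_age_s=900, max_requests=200):
        self._factory = factory
        self._max_age_s = max_age_s
        self._max_requests = max_requests
        self._session = None
        self._created = 0.0
        self._used = 0

    def get(self):
        """Return (session, reused); build a new session when a limit is hit."""
        expired = (
            self._session is None
            or time.monotonic() - self._created >= self._max_age_s
            or self._used >= self._max_requests
        )
        if expired:
            self._session = self._factory()
            self._created = time.monotonic()
            self._used = 0
        self._used += 1
        return self._session, not expired

    def discard(self):
        """Drop the current session, e.g. after a bounce back to login."""
        self._session = None


def summarize(resp, reused):
    """Collect the fields worth logging from a requests.Response-like object."""
    return {
        "status": resp.status_code,
        "final_url": resp.url,
        "redirects": [(r.status_code, r.url) for r in resp.history],
        "set_cookie": resp.headers.get("Set-Cookie"),
        "session_reused": reused,
    }
```

In use, `session, reused = pool.get()` runs before each request and `summarize(resp, reused)` goes to the log afterwards. Keeping one `RotatingSession` per account or workflow also covers the tip about not mixing sessions.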

If you are scraping through a router layer, the real question is usually not "can I store cookies?" but "can I keep the same identity long enough for the site to believe the flow is one user?" That's where sessions get annoying in production: the cookie jar, proxy continuity, and browser state all have to line up.
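One way to keep those pieces aligned is to treat the proxy and the session-bound headers as a single unit that is applied once and then left alone. The sketch below assumes `session` is a `requests.Session`, which exposes `headers` and `proxies` as dict-like attributes, and that the proxy endpoint itself keeps the exit IP stable. `Identity` and `pin_identity` are hypothetical names, and the default header value is only an example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    """Everything that should stay fixed for the life of one session."""

    proxy_url: str
    user_agent: str
    accept_language: str = "en-US,en;q=0.9"
    extra_headers: dict = field(default_factory=dict)


def pin_identity(session, identity):
    """Apply one identity to a requests.Session-like object and return it."""
    session.headers.update(
        {
            "User-Agent": identity.user_agent,
            "Accept-Language": identity.accept_language,
            **identity.extra_headers,
        }
    )
    # Route both schemes through the same proxy so the exit IP stays the
    # same, assuming the proxy endpoint is itself "sticky".
    session.proxies.update(
        {"http": identity.proxy_url, "https": identity.proxy_url}
    )
    return session
```

Because the cookie jar already lives on the session, pinning the proxy and headers to that same object means all three are created and discarded together.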

Use cases

  • Authenticated scraping: log in once, then pull account data, invoices, order history, or internal dashboards
  • Multi-step flows: search results to product page to seller page, or form page to submit page to confirmation page
  • Geo or preference state: sites that store selected country, currency, zip code, or store location in session state
  • Cart and pricing flows: some pricing only appears after a session picks up location, inventory context, or user segment
  • Anti-bot-sensitive targets: sites that allow a few requests, then start checking whether the full request sequence still looks coherent
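The geo and pricing cases above share one shape: one request sets state, and a later request depends on it. A minimal sketch of that shape follows, assuming `session` is a `requests.Session`. The `/store/select` endpoint and its `zip` field are hypothetical stand-ins for whatever the target site actually uses.

```python
def price_page_for_store(session, base_url, zip_code, product_path):
    """Select a store location, then read a product page in the same session.

    The endpoint path and form field below are hypothetical; inspect the
    target site's own requests to find the real ones.
    """
    # Step 1: the site records the chosen location in session state.
    chosen = session.post(
        f"{base_url}/store/select", data={"zip": zip_code}, timeout=10
    )
    chosen.raise_for_status()

    # Step 2: the same session carries that state, so the page reflects it.
    page = session.get(f"{base_url}{product_path}", timeout=10)
    page.raise_for_status()
    return page.text
```

Running step 2 on a fresh session would typically return the default location's page, which is why the two calls have to share one session.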

With ScrapeRouter, session handling starts to matter once you need more than a one-off fetch. If the target expects state to persist across requests, you need session handling that survives real production conditions, not just something that worked once in a notebook.

Related terms

  • Cookies
  • Authentication
  • CSRF Token
  • Proxy Rotation
  • Sticky Session
  • Headless Browser