Examples
A few common cases where people reach for a headless browser:
- The page renders data only after JavaScript runs
- Content appears after clicking, scrolling, or waiting for XHR calls
- You need cookies, local storage, or a real browser execution context so the session looks like a normal user's
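# Minimal Playwright example: load a JavaScript-rendered page and count product cards
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch Chromium without a visible window
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    # Wait until network activity settles so client-side rendering can finish
    page.goto("https://example.com/products", wait_until="networkidle")
    print(page.title())
    # Number of elements matching the product card selector
    print(page.locator(".product-card").count())
    browser.close()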
A headless script is typically run from the shell like any other script, for example with Node.js:
# Typical pattern: run browser automation in headless mode
node scrape.js
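Waiting for network idle works for a quick demo, but it is often more reliable to wait for the specific thing you need. The sketch below waits for one API response and one selector instead. The `/api/products` path and the `.product-card` selector are assumptions for illustration, not something the example site is known to expose; substitute whatever you see in the network tab of the real page.

```python
from urllib.parse import urlparse

# Hypothetical endpoint path; replace with the one the target page actually calls.
API_PATH = "/api/products"


def is_products_response(url: str, status: int) -> bool:
    """True for a successful (2xx) response from the assumed products endpoint."""
    return urlparse(url).path.startswith(API_PATH) and 200 <= status < 300


def scrape_products(url: str):
    """Load the page, wait for the products API call and the first card."""
    # Imported here so the helper above stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()

        # Start listening before navigating so the response is not missed.
        with page.expect_response(
            lambda r: is_products_response(r.url, r.status)
        ) as response_info:
            page.goto(url)
        payload = response_info.value.json()

        # Wait until at least one card is visible instead of sleeping.
        page.locator(".product-card").first.wait_for()
        card_count = page.locator(".product-card").count()

        browser.close()
        return card_count, payload
```

Calling `scrape_products("https://example.com/products")` would return the card count and the parsed JSON, assuming the page really makes that request. If it never does, `expect_response` raises a timeout error, which is usually a better failure signal than a silent empty result.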
Practical tips
- Don’t default to headless browsers for everything: they are slower, heavier, and more expensive than plain HTTP scraping
- Use them when the site actually needs rendering: client-side apps, interaction-heavy flows, bot checks tied to browser behavior
- Expect more operational mess in production: memory usage, browser crashes, timeouts, fingerprinting issues, proxy coordination
- Wait for the right thing, not just a fixed sleep: network idle, a selector, a specific API response
- If you only need the underlying API calls, inspect the network first: sometimes the browser is just an expensive way to discover a JSON endpoint
- At scale, the hard part is not launching Playwright once: it is keeping browser sessions stable, unblocked, and affordable over time
- If you're using ScrapeRouter, this is the kind of thing you route only when needed: simple pages through cheaper request-based scraping, browser-required pages through a headless path
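The "route only when needed" idea from the tips above can be sketched as a request-first fetch with a browser fallback. This is a generic illustration, not ScrapeRouter's API: the function names are hypothetical, and the check (does the raw HTML already contain elements with the `product-card` class?) is an assumed heuristic that you would adapt per site.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name: str) -> None:
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_by_class(html: str, class_name: str) -> int:
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


def fetch_plain(url: str, timeout: float = 10.0) -> str:
    """Cheap path: a single HTTP request, no JavaScript execution."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_rendered(url: str) -> str:
    """Expensive path: render the page in headless Chromium."""
    # Imported lazily so the cheap path works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()
        browser.close()
        return html


def fetch_products_html(url, class_name="product-card",
                        fetch=fetch_plain, render=fetch_rendered):
    """Try plain HTTP first; fall back to the browser only if the data is missing."""
    html = fetch(url)
    if count_by_class(html, class_name) > 0:
        return html, "http"
    return render(url), "browser"
```

The `fetch` and `render` parameters exist so the routing decision can be tested without network access or a browser. In practice you would also cache the decision per site, so a page known to need rendering does not pay for a wasted HTTP request every time.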
Use cases
- Scraping JavaScript-heavy storefronts where product data is injected after page load
- Logging into sites that rely on browser state: cookies, tokens, redirects, local storage
- Interacting with filters, pagination, infinite scroll, and click-to-reveal content
- Capturing rendered HTML or screenshots for monitoring and QA-style checks
- Handling anti-bot flows where a plain HTTP client gets blocked but a browser session has a better chance
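As one illustration of the infinite scroll use case, here is a minimal sketch that scrolls until no new items appear. The selector, scroll distance, and pause length are assumptions; a short pause is used for brevity, and waiting on the item count or a specific response is more robust on slow pages.

```python
def scroll_until_stable(page, selector: str, max_rounds: int = 20,
                        pause_ms: int = 500) -> int:
    """Scroll down until the number of matching elements stops growing.

    `page` is expected to behave like a Playwright sync Page: it needs
    `locator(selector).count()`, `mouse.wheel(dx, dy)`, and `wait_for_timeout(ms)`.
    Returns the final element count.
    """
    seen = page.locator(selector).count()
    for _ in range(max_rounds):
        page.mouse.wheel(0, 4000)        # scroll down by an arbitrary distance
        page.wait_for_timeout(pause_ms)  # give the page time to load more items
        current = page.locator(selector).count()
        if current <= seen:
            break                        # nothing new appeared, assume the end
        seen = current
    return seen


def collect_all_cards(url: str, selector: str = ".product-card") -> int:
    """Open the page headlessly and scroll until the list stops growing."""
    # Imported lazily so scroll_until_stable can be tested with a stand-in page.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.locator(selector).first.wait_for()
        total = scroll_until_stable(page, selector)
        browser.close()
        return total
```

The `max_rounds` cap matters in production: some feeds never end, and an uncapped loop is an easy way to burn memory and browser time.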