Glossary

local storage

Local storage is the browser's built-in key-value store that lets a website save string data on a user's device and keep it across page reloads and browser restarts. In scraping, it matters because many modern apps read tokens, flags, or cached state from local storage, and a plain HTTP request never runs the JavaScript that reads or writes that state, so it misses part of what the app is actually doing.

Examples

A lot of SPAs stash auth state, feature flags, or API config in local storage. If you only fetch the HTML, you won't see any of that.

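from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    # Load the page so the app's JavaScript can run and populate local storage.
    page.goto("https://example.com")

    # getItem returns null (None in Python) if the key is missing or not written yet.
    token = page.evaluate("() => window.localStorage.getItem('auth_token')")
    print(token)

    browser.close()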
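
When you don't know which key the app uses, it can help to dump everything at once. The following is a minimal sketch, not part of any library: dump_local_storage and decode_values are hypothetical helper names, and page is assumed to be a Playwright page that has already navigated to the site.

```python
import json

# JavaScript run inside the page: copy every local storage entry into a plain object.
DUMP_LOCAL_STORAGE_JS = """() => {
  const out = {};
  for (let i = 0; i < window.localStorage.length; i++) {
    const key = window.localStorage.key(i);
    out[key] = window.localStorage.getItem(key);
  }
  return out;
}"""


def dump_local_storage(page):
    """Return the page's local storage as a dict of str -> str."""
    return page.evaluate(DUMP_LOCAL_STORAGE_JS)


def decode_values(raw):
    """JSON-decode each value where possible; keep the raw string otherwise."""
    decoded = {}
    for key, value in raw.items():
        try:
            decoded[key] = json.loads(value)
        except (TypeError, ValueError):
            # Not JSON (for example an opaque token), so leave it untouched.
            decoded[key] = value
    return decoded
```

Local storage values are always strings, so apps that store objects usually serialize them as JSON. Calling decode_values(dump_local_storage(page)) after page.goto(...) gives a snapshot that is easier to inspect.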

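You can also seed local storage when the app expects a key to already be there. Because local storage is scoped to an origin, load the origin once, write the key, then reload so the app starts up with it in place.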

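from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    # Local storage is per-origin, so visit the origin once before writing to it.
    page.goto("https://example.com")
    page.evaluate("() => window.localStorage.setItem('region', 'us')")
    # Reload so the app boots with the key already in place.
    page.reload()
    browser.close()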
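
An alternative that avoids the extra reload is to create the browser context with local storage already filled in. The sketch below assumes Playwright's storage_state option on new_context, which accepts a dict of cookies and per-origin local storage entries. build_storage_state and open_seeded_page are hypothetical helper names, and the URL and key are the same placeholders used above.

```python
def build_storage_state(origin, items):
    """Build a storage state dict that seeds local storage for one origin."""
    return {
        "cookies": [],
        "origins": [
            {
                "origin": origin,
                "localStorage": [
                    {"name": name, "value": value} for name, value in items.items()
                ],
            }
        ],
    }


def open_seeded_page(url, origin, items):
    """Open url in a fresh context and read the seeded keys back from the page."""
    # Imported here so build_storage_state works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # The context starts with the given local storage entries for that origin.
        context = browser.new_context(storage_state=build_storage_state(origin, items))
        page = context.new_page()
        page.goto(url)
        # Confirm the values are visible to page scripts.
        seen = page.evaluate(
            "(keys) => Object.fromEntries(keys.map(k => [k, window.localStorage.getItem(k)]))",
            list(items),
        )
        browser.close()
        return seen
```

For example, open_seeded_page("https://example.com", "https://example.com", {"region": "us"}) should load the page with region already set, assuming the app does not overwrite it on startup.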

Practical tips

  • Local storage is browser-only: you need a real browser context or JS-capable renderer to read or write it.
  • It's scoped by origin: https://app.example.com and https://www.example.com do not share the same local storage.
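  • It persists: unlike session storage, it survives tab closes and browser restarts unless cleared. In automation, that only holds if you reuse the same browser profile or saved state; a fresh browser context starts empty.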
  • Don't assume it's trustworthy: apps often put useful state there, but it's still client-side data and can be stale.
  • For scraping, this is a production gotcha: if login state or API config lives in local storage, request-only scraping breaks quickly, and you end up reverse-engineering the app's JavaScript instead of parsing the page.
  • If the site depends on local storage, use browser automation: that's the clean answer. Trying to fake everything with raw requests gets fragile fast.
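
The origin-scoping rule above is easy to get wrong when a site spans several subdomains. As a rough sketch, an origin can be treated as the scheme, host, and port of a URL. origin_of and shares_local_storage are hypothetical helper names, and the default-port handling covers only http and https.

```python
from urllib.parse import urlsplit

# Default ports, so https://example.com and https://example.com:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) tuple that identifies the URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def shares_local_storage(url_a, url_b):
    """True when both URLs have the same origin and so see the same local storage."""
    return origin_of(url_a) == origin_of(url_b)
```

With this, shares_local_storage("https://app.example.com/login", "https://www.example.com/") is False, so a key written on one subdomain will not be visible on the other.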

Use cases

  • Reading auth state: some apps store bearer tokens or session metadata in local storage.
  • Capturing app configuration: frontend code may pull API base URLs, region settings, or feature flags from local storage.
  • Reproducing logged-in flows: setting expected keys before navigation can help load the same state a real user sees.
  • Debugging scraper failures: when a page works in a browser but not in your scraper, missing local storage is one of the first things worth checking.
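
For the debugging case, one approach is to compare a local storage snapshot from a working browser session against the one your scraper ends up with. The following is an illustrative sketch: diff_local_storage is a hypothetical helper, and both arguments are assumed to be plain dicts of key to string value.

```python
def diff_local_storage(expected, actual):
    """Compare two local storage snapshots and report how they differ."""
    # Keys the working session has but the scraper's session lacks.
    missing = sorted(key for key in expected if key not in actual)
    # Keys present in both snapshots but with different values.
    changed = sorted(
        key for key in expected if key in actual and expected[key] != actual[key]
    )
    # Keys only the scraper's session has.
    extra = sorted(key for key in actual if key not in expected)
    return {"missing": missing, "changed": changed, "extra": extra}
```

Anything listed under missing is a good first suspect, since the app may refuse to load data until that key exists.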

Related terms

  • cookies
  • session storage
  • headless browser
  • javascript rendering
  • browser automation
  • single-page application