Glossary

JWT

JWT stands for JSON Web Token: a compact token format used to send claims like user identity, expiration, and permissions between a client and a server. In scraping, you mostly run into JWTs when an API expects a Bearer token after login, especially on SPAs and mobile app backends.

Examples

A JWT is typically sent in the Authorization header:

curl 'https://api.example.com/v1/profile' \
  -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjMiLCJleHAiOjE3MjAwMDAwMDB9.signature'

In Python, you can reuse the token for authenticated API requests:

import requests

jwt_token = "eyJhbGciOi..."
headers = {
    "Authorization": f"Bearer {jwt_token}",
    "Accept": "application/json",
}

resp = requests.get("https://api.example.com/v1/orders", headers=headers, timeout=30)
print(resp.status_code)
print(resp.text)

If you want to inspect a JWT payload during debugging, split it into its three parts: header, payload, signature. The payload is base64url-encoded JSON, not encrypted by default.

import base64
import json

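# Use a full three-part token: an elided "..." inside the string would add
# extra dots and make split(".")[1] return the wrong (empty) segment.
jwt_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjMiLCJleHAiOjE3MjAwMDAwMDB9.signature"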
payload = jwt_token.split(".")[1]
padding = "=" * (-len(payload) % 4)
decoded = base64.urlsafe_b64decode(payload + padding)
print(json.loads(decoded))
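
The same idea extends to the header segment, and the exp claim is easier to read as a UTC timestamp. A minimal sketch, assuming a three-part token; the helper names decode_segment and inspect_jwt are illustrative, not from any library, and nothing here verifies the signature:

```python
import base64
import json
from datetime import datetime, timezone


def decode_segment(segment: str) -> dict:
    """Decode one base64url-encoded JWT segment (header or payload)."""
    padding = "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment + padding))


def inspect_jwt(jwt_token: str) -> dict:
    """Return decoded header, payload, and exp as an ISO 8601 UTC string.

    Debugging aid only: the signature is ignored, not verified.
    """
    header_b64, payload_b64, _signature = jwt_token.split(".")
    header = decode_segment(header_b64)
    payload = decode_segment(payload_b64)
    exp = payload.get("exp")
    expires_at = (
        datetime.fromtimestamp(exp, tz=timezone.utc).isoformat()
        if exp is not None
        else None
    )
    return {"header": header, "payload": payload, "expires_at": expires_at}


# Sample token from the curl example above (placeholder signature).
sample = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjMiLCJleHAiOjE3MjAwMDAwMDB9.signature"
print(inspect_jwt(sample))
```

The alg field in the header tells you how the token was signed, which is occasionally useful when comparing tokens from different environments.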

Practical tips

  • Do not treat a JWT like a password: it is often short-lived and tied to a session or device.
  • Check the exp claim before blaming proxies or parsing code. A lot of "random" 401s are just expired tokens.
  • JWT payloads are readable unless the system uses encryption on top. Do not assume the contents are hidden just because the token looks opaque.
  • Mobile app APIs often use JWTs after a login flow: you may need to reproduce the auth request first, then reuse the returned Bearer token for the data endpoints.
  • Do not hardcode a token into a long-running scraper unless you enjoy babysitting breakage. Build refresh logic or re-login logic.
  • Watch for token binding: some backends tie the JWT to cookies, device IDs, app version headers, or IP patterns.
  • If the target is browser-protected and the token only appears after heavy frontend work, use ScrapeRouter to handle the page fetch and session side first, then extract what you need from the resulting traffic or page state.
  • Do not try to forge JWTs unless you actually control the signing key. Reading a payload is easy; generating a valid signed token is not.
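
The exp check and the re-login advice above can be combined into a small token cache. This is a sketch under assumptions, not a definitive implementation: TokenManager, is_expired, and the login callable are hypothetical names, and the 30-second leeway is an arbitrary safety margin.

```python
import base64
import json
import time
from typing import Callable, Optional


def decode_payload(jwt_token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    segment = jwt_token.split(".")[1]
    padding = "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment + padding))


def is_expired(jwt_token: str, leeway: int = 30, now: Optional[float] = None) -> bool:
    """Return True if exp is in the past or within `leeway` seconds of now."""
    exp = decode_payload(jwt_token).get("exp")
    if exp is None:
        return False  # no exp claim: expiry cannot be judged from the token alone
    current = time.time() if now is None else now
    return current >= exp - leeway


class TokenManager:
    """Cache a JWT and fetch a new one when the cached token is near expiry."""

    def __init__(self, login: Callable[[], str]):
        # `login` is a hypothetical callable that performs the auth request
        # (or a refresh request) and returns a fresh JWT string.
        self._login = login
        self._token: Optional[str] = None

    def get(self) -> str:
        if self._token is None or is_expired(self._token):
            self._token = self._login()
        return self._token

    def auth_headers(self) -> dict:
        """Build request headers with a currently valid Bearer token."""
        return {"Authorization": f"Bearer {self.get()}", "Accept": "application/json"}
```

In a scraper, login would typically reproduce the target's auth request and return the token from the response; the endpoint and response field depend on the target. Calling auth_headers() before each request then keeps the token fresh without hardcoding it.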

Use cases

  • Scraping a private dashboard API after logging in through the site once and reusing the returned Bearer token.
  • Pulling data from mobile app endpoints where the app exchanges credentials for a JWT, then sends that token with every API request.
  • Debugging why an authenticated scraper suddenly gets 401 responses: expired JWT, missing refresh flow, wrong audience claim, or missing companion cookies.
  • Inspecting token payloads to understand session lifetime, user roles, tenant IDs, or which backend environment the app is talking to.
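
For the 401 debugging case above, a few of the listed causes can be checked directly from the token and the request state. A rough sketch, assuming you know which audience and cookies the backend expects; diagnose_401 and its parameters are illustrative names, and the checks are hints rather than proof of what the server rejected:

```python
import base64
import json
import time


def decode_payload(jwt_token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    segment = jwt_token.split(".")[1]
    padding = "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment + padding))


def diagnose_401(jwt_token, expected_audience=None, cookies=None,
                 required_cookies=(), now=None):
    """Return a list of likely reasons an authenticated request got a 401."""
    hints = []
    claims = decode_payload(jwt_token)
    current = time.time() if now is None else now

    # Expired token: the most common cause.
    exp = claims.get("exp")
    if exp is not None and current >= exp:
        hints.append(f"token expired {int(current - exp)}s ago")

    # Wrong audience: aud may be a single string or a list of strings.
    if expected_audience is not None:
        aud = claims.get("aud")
        audiences = aud if isinstance(aud, list) else [aud]
        if expected_audience not in audiences:
            hints.append(f"aud claim {aud!r} does not include {expected_audience!r}")

    # Missing companion cookies the backend may require next to the token.
    missing = [name for name in required_cookies if name not in (cookies or {})]
    if missing:
        hints.append(f"missing companion cookies: {missing}")

    return hints
```

An empty list means none of these checks fired, in which case token binding to device IDs, headers, or IP patterns is worth looking at next.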

Related terms

  • Bearer Token
  • Authentication
  • Session Cookie
  • Access Token
  • Refresh Token
  • API Endpoint
  • HTTP Headers