Glossary

JA3/JA4 fingerprint

JA3 and JA4 are TLS client fingerprints: compact hashes of how a client opens an HTTPS connection, derived from fields in the TLS ClientHello such as the protocol version, cipher suites, and extensions. In scraping they matter because many bot defenses use them to spot traffic from default Python HTTP stacks, headless tooling, or other non-browser clients before the request ever reaches the page.
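To make the idea concrete, here is a minimal sketch of how a JA3 hash is built: five ClientHello fields, values dash-joined within a field, fields comma-joined, then MD5-hashed. The numeric values below are made up for illustration, and real implementations also strip GREASE values before hashing.

```python
import hashlib

def ja3_digest(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string from five ClientHello fields and MD5-hash it.

    Field order: SSLVersion, Ciphers, Extensions, EllipticCurves,
    EllipticCurvePointFormats. Values within a field are dash-joined;
    fields are comma-joined.
    """
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return hashlib.md5(ja3_string.encode()).hexdigest()

# Illustrative (made-up) ClientHello values, not a real browser's.
print(ja3_digest(771, [4865, 4866, 4867], [0, 11, 10], [29, 23, 24], [0]))
```

Because the hash covers the whole field list, any difference in cipher order or extension set between two clients produces a completely different fingerprint, even when their HTTP headers match.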

Examples

A simple example: two clients send the same HTTP headers, but one uses Chrome's TLS handshake and the other uses a default Python TLS stack. The site can treat them differently because the TLS fingerprint is different.

import requests

# Headers can be spoofed, but the TLS ClientHello this sends
# is produced by Python's default stack, not a browser's.
resp = requests.get("https://example.com", timeout=10)
print(resp.status_code)

That request may look fine at the HTTP layer, but the TLS handshake behind it can still stand out.
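One quick local way to see part of what the fingerprint captures is to list the ciphers Python's default TLS context offers. Browsers offer a different set and ordering, which is exactly the kind of difference JA3/JA4 hashes pick up. This only shows the cipher field, not the full ClientHello:

```python
import ssl

# Inspect the cipher suites Python's default TLS context would offer.
ctx = ssl.create_default_context()
for cipher in ctx.get_ciphers()[:5]:
    print(cipher["name"], cipher["protocol"])
```

Running the same inspection against a browser's handshake (for example via a packet capture) shows a different list, so the two clients hash to different fingerprints even with identical headers.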

Another common production pattern: your scraper works in small tests, then starts getting blocked as volume goes up, because the target is clustering requests by TLS fingerprint, not just by IP or headers.

curl -X POST https://www.scraperouter.com/api/v1/scrape/ \
  -H "Authorization: Api-Key $api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/123"
  }'

The point of using a router layer here is not magic. It's avoiding the usual mess of hand-tuning clients, proxies, and browser infrastructure just to stop your TLS fingerprint from looking obviously wrong.

Practical tips

  • Don't assume header spoofing is enough: a realistic User-Agent with a bad TLS fingerprint still gets flagged.
  • Watch for blocks that happen before any meaningful page response: 403s, challenge pages, connection resets, or traffic that only fails on stricter targets.
  • Test different request stacks: requests, httpx, curl, browser automation, and routed APIs can produce very different TLS fingerprints.
  • Treat JA3/JA4 as one signal, not the whole system: sites also look at IP reputation, cookies, request timing, HTTP/2 behavior, and inter-request patterns.
  • If you're scraping easy targets, don't over-engineer this: you only need to care when TLS-level detection is actually part of the blocking.
  • In production, optimize for stability, not just first success: getting one request through is easy; keeping a job alive for 500k requests is the real test.
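The tips about testing different request stacks and watching for early blocks can be sketched as a small bookkeeping harness. This is a hypothetical helper (the class and stack names are not from any library): it treats any 4xx/5xx status as "blocked", which is a crude proxy, since real blocks also show up as challenge pages or connection resets.

```python
from collections import defaultdict

class StackStats:
    """Track per-client-stack outcomes so TLS-level clustering shows up
    as one stack failing while others succeed against the same target."""

    def __init__(self):
        self.counts = defaultdict(lambda: {"ok": 0, "blocked": 0})

    def record(self, stack, status_code):
        key = "ok" if status_code < 400 else "blocked"
        self.counts[stack][key] += 1

    def block_rate(self, stack):
        c = self.counts[stack]
        total = c["ok"] + c["blocked"]
        return c["blocked"] / total if total else 0.0

stats = StackStats()
stats.record("requests", 403)
stats.record("browser", 200)
print(stats.block_rate("requests"), stats.block_rate("browser"))
```

If one stack's block rate climbs while the others stay flat on the same URLs and IPs, that is a strong hint the defense is keying on something stack-specific, and TLS fingerprinting is a prime suspect.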

Use cases

  • Debugging why a scraper with correct headers still gets blocked on login pages, search pages, or high-value product endpoints.
  • Explaining why browser-based scraping succeeds while a lightweight HTTP client fails against the same URL.
  • Routing traffic through infrastructure that presents more realistic network fingerprints when direct requests are too easy to classify.
  • Reducing maintenance work when anti-bot systems start keying on default TLS signatures from common scraping libraries.

Related terms

TLS fingerprinting, HTTP fingerprinting, Bot detection, User-Agent, Proxy rotation, Headless browser, Cloudflare, Rate limiting