Examples
A scraper can pass basic checks and still get blocked because its behavior gives it away.
- Too-fast navigation: loading product pages every 300ms with no reading time
- No real interaction: never scrolling, never moving the mouse, never focusing inputs, just requesting pages in sequence
- Overly perfect timing: fixed delays like exactly 2 seconds between every action
- Impossible sessions: opening pages, clicking buttons, and submitting forms faster than a person could reasonably do it
A minimal Playwright sketch of this kind of pacing:

```python
import random
import time

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://example.com")

    # Pause as if reading, with a randomized dwell time
    time.sleep(random.uniform(1.2, 3.8))

    # Move the mouse in small steps rather than teleporting it
    page.mouse.move(200, 300, steps=12)

    # Scroll a variable distance down the page
    page.mouse.wheel(0, random.randint(400, 900))
    time.sleep(random.uniform(0.8, 2.1))

    page.click("text=Next")
    browser.close()
```
That kind of behavior simulation can reduce obvious bot signals, but it does not magically make a browser session look human. If the rest of the stack is weak — IP reputation, TLS and browser fingerprints, headers — behavioral detection still catches it.
Practical tips
- Treat behavioral detection as a separate layer from IP reputation, TLS fingerprinting, header checks, and browser fingerprinting. Passing one layer does not mean you're clean.
- Avoid fixed delays. Use bounded randomness that matches the page type: search results, product detail, login, checkout, and pagination all have different normal interaction patterns.
- Don’t fake interaction everywhere just because you can. Random mouse movement on every page often looks more suspicious, not less.
- Keep session flow believable: landing page, list page, detail page, maybe back to list. Real users do not hit 40 product URLs in a perfectly linear loop.
- Watch for challenge pages, hidden JavaScript checks, and soft blocks like empty results or missing content. Behavioral systems often degrade responses before they return a hard 403.
- In production, the real problem is consistency. A setup that works for one test run often falls apart across volume, longer sessions, or different targets.
- If you use a scraping API or router layer, make sure it can handle browser execution and anti-bot adaptation when needed. This is exactly where plain HTTP clients start wasting engineering time.
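The "bounded randomness that matches the page type" idea above can be sketched as a small delay table. The ranges below are illustrative assumptions, not measured values; tune them against real session data for each target:

```python
import random
import time

# Illustrative dwell-time ranges in seconds per page type.
# These numbers are assumptions; calibrate them against real user sessions.
DWELL_RANGES = {
    "search": (1.0, 3.0),      # users scan result lists quickly
    "product": (4.0, 12.0),    # detail pages get real reading time
    "login": (2.0, 6.0),       # typing credentials takes a moment
    "pagination": (0.8, 2.5),  # next-page clicks are fast but not instant
}

def human_pause(page_type: str) -> float:
    """Sleep for a bounded random interval appropriate to the page type."""
    low, high = DWELL_RANGES.get(page_type, (1.0, 4.0))
    delay = random.uniform(low, high)
    time.sleep(delay)
    return delay
```

The point is not the specific numbers but the structure: each page type draws from its own range, so no two sessions share a single fixed rhythm.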
Use cases
- Retail scraping: product pages load fine for a while, then pagination and search start failing because the browsing pattern is too aggressive
- Account workflows: login, search, and form flows trigger bot defenses because actions happen faster and more consistently than human sessions
- SERP and aggregator scraping: repeated query-submit-click loops get flagged even with rotated proxies because the interaction pattern is robotic
- Protected sites with JS-heavy defenses: the site waits for client-side behavior signals before deciding whether to serve content, challenge, or silently throttle
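Catching the soft blocks described above (empty results, missing content, silent throttling) usually means checking response quality after every fetch, not just the status code. A minimal sketch, where the markers, status codes, and length threshold are assumptions to adapt per target:

```python
# Hypothetical soft-block heuristics: a hard 403 is obvious, but degraded
# responses need content-level checks. Markers and thresholds are assumptions.
CHALLENGE_MARKERS = ("captcha", "verify you are human", "unusual traffic")

def looks_soft_blocked(status: int, body: str, min_length: int = 2000) -> bool:
    """Return True when a response is probably a challenge or degraded page."""
    if status in (403, 429):
        return True                      # hard block: no guessing needed
    text = body.lower()
    if any(marker in text for marker in CHALLENGE_MARKERS):
        return True                      # challenge page served with a 200
    if len(body) < min_length:
        return True                      # suspiciously thin page for a 200
    return False
```

Wiring a check like this into the fetch loop lets a scraper back off early, before an aggressive pattern escalates from throttling to an outright ban.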