Glossary

Bounded Randomness

Bounded randomness means adding variation to scraper behavior inside sensible limits instead of using fixed, perfectly repeatable timing. In practice, that usually means random delays, dwell times, and request spacing that stay within a defined range, so you look less bot-like without turning the job into chaos.

Examples

A basic example is replacing a fixed 1-second delay with a range that still keeps throughput predictable.

import random
import time

for url in urls:
    fetch(url)
    # Uniform delay in [0.4, 1.4] s averages ~0.9 s per request,
    # so overall throughput stays predictable.
    time.sleep(random.uniform(0.4, 1.4))

For heavier pages, you usually want wider bounds because real users do not click through a JS-heavy flow every 800 ms forever.

import random
import time

for product_url in product_urls:
    render_and_extract(product_url)
    time.sleep(random.uniform(2.0, 8.0))

You can apply the same idea to login events or account actions.

import random
import time

for account in accounts:
    login(account)
    run_job(account)
    time.sleep(random.uniform(30, 180))

Practical tips

  • Keep the randomness bounded: randomness is useful, unbounded randomness is just sloppy. Pick ranges based on page weight, session type, and how fast a real user could plausibly move.
  • Do not randomize everything: if your headers, TLS fingerprint, IP, session behavior, and timing all vary in unrelated ways, you are not "human-like". You are just inconsistent.
  • Match delay ranges to the flow: roughly 400-1400 ms for simple listing pages, 1-3 s for paginated browsing, 2-8 s for JS-heavy detail pages, and much longer gaps for account logins or checkout-like actions.
  • Use it with concurrency control: bounded randomness helps with timing patterns, but it does not fix a scraper blasting 500 requests at once.
  • Be consistent enough to operate: the point is less predictability to the target, not less predictability to your own system. You still want stable job duration and capacity planning.
  • Skip browser delays when you do not need a browser: if the site exposes clean JSON or works fine over plain HTTP, use that. Adding random browser waits to a simple API scrape is just burning money.
  • If you use ScrapeRouter: this is the kind of thing you want handled as part of a routing and anti-bot strategy, not re-implemented badly in every scraper.
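
The concurrency point above can be sketched as a worker pool with per-request jitter: the pool caps how many requests are in flight at once, while each worker adds its own bounded delay. This is a minimal sketch; `fetch` is a hypothetical stand-in for your actual HTTP client, and the worker count and bounds are placeholders you would tune per target.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Hypothetical placeholder; swap in your real HTTP client call.
    pass


def crawl(urls, max_workers=5, min_s=0.4, max_s=1.4):
    """Cap in-flight requests AND jitter each worker's pacing."""
    def worker(url):
        fetch(url)
        # Bounded randomness per worker; the pool bounds concurrency.
        time.sleep(random.uniform(min_s, max_s))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Force the iterator so all jobs complete before the pool closes.
        list(pool.map(worker, urls))
```

Neither half works alone: jitter without a worker cap still lets 500 requests land at once, and a worker cap without jitter produces machine-regular spacing.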

A simple helper:

import random
import time


def bounded_sleep(min_s, max_s):
    """Sleep for a random duration in [min_s, max_s] and return it."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    # Returning the delay lets callers log it or track total sleep time.
    return delay
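
A variant worth knowing: instead of a uniform draw, some scrapers sample from a normal distribution clamped to the bounds, so delays cluster around a typical value rather than spreading evenly. This is a sketch of that idea, not part of the helper above; the function name and the sigma choice are illustrative assumptions.

```python
import random


def bounded_gauss_delay(min_s, max_s):
    """Return a delay that clusters near the midpoint of [min_s, max_s].

    Illustrative variant: a clamped Gaussian draw, so most delays
    land near a "typical" pause while never leaving the bounds.
    """
    mid = (min_s + max_s) / 2
    sigma = (max_s - min_s) / 4  # keeps most raw samples inside the range
    delay = random.gauss(mid, sigma)
    # Clamp outliers back into the bounded range.
    return max(min_s, min(max_s, delay))
```

Feed the result into `time.sleep` exactly as with the uniform helper; the only difference is the shape of the distribution.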

Use cases

  • Retail scraping: spacing product page requests so a crawl does not hit the same store with perfectly uniform timing for hours.
  • Search and SERP collection: adding realistic pauses between queries, especially when rotating IPs or sessions.
  • Authenticated scraping: spreading logins, dashboard loads, and export actions across time so account activity does not look machine-stamped.
  • Browser automation in production: slowing down only the flows that actually need it, instead of globally making every job expensive and slow.
  • Multi-provider scraping systems: combining bounded timing with proxy rotation, retries, and provider routing so failures do not cluster into obvious traffic spikes.
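
The retry point in the last bullet pairs naturally with bounded randomness: jittered, capped backoff spreads retries out so failures do not re-cluster into a synchronized spike. A minimal sketch, assuming a hypothetical `fetch` callable that raises on failure; the attempt count, base, and cap are placeholder values.

```python
import random
import time


def fetch_with_backoff(fetch, url, attempts=4, base=1.0, cap=30.0):
    """Retry with exponentially growing but bounded, jittered waits."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise
            # "Full jitter": random wait in [0, min(cap, base * 2**attempt)],
            # so concurrent retries spread out instead of stampeding together.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The cap matters as much as the jitter: it keeps worst-case job duration predictable, which is the "consistent enough to operate" point from the tips above.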

Related terms

  • Rate Limiting
  • Request Throttling
  • Concurrency Control
  • Session Management
  • Proxy Rotation
  • Fingerprinting
  • Anti-Bot Detection
  • JavaScript Rendering