Glossary

Burst limit

A burst limit is the maximum number of requests you can send in a short spike before rate limiting kicks in. It matters because many systems allow brief bursts above the steady request rate, but they still block you if that spike is too large or happens too often.

Examples

A site might allow 5 requests per second on average, but also allow a short burst of 20 requests at once before returning 429s. That sounds fine in testing, but falls apart in production when a queue flush, a retry storm, or a fresh batch of workers all hit at the same time.

import time
import requests

url = "https://example.com/data"
headers = {"User-Agent": "scraper"}

# 20 requests with only 50 ms between them: roughly a 20-request
# burst inside one second, well above a 5 req/s steady rate.
for i in range(20):
    r = requests.get(url, headers=headers)
    print(i, r.status_code)
    time.sleep(0.05)

If the target has a low burst limit, the first few requests may work and the rest may start returning 429 Too Many Requests even though your long-term average rate is not that high.

Practical tips

  • Don't just control average request rate: control concurrency and short-window spikes too.
  • Watch for the real production causes of bursts: queue drains, retries, autoscaling, cron jobs starting at the same minute.
  • Token bucket style throttling is common here: it lets you send small bursts, then refills over time.
  • If you're scraping at scale, spread requests across time instead of letting workers fire all at once.
  • If a target keeps rate limiting on spikes, back off fast instead of hammering through it.
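The token bucket style of throttling mentioned above can be sketched in a few lines. The `rate` and `capacity` values here are illustrative, not anything a specific target publishes:

```python
import time

class TokenBucket:
    """Allow short bursts up to `capacity` requests, then refill
    at `rate` tokens per second to enforce the long-term average."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # refill speed, tokens per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def acquire(self) -> bool:
        """Spend one token if available; False means 'wait before sending'."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=20)  # ~5 req/s average, bursts of 20
sent = sum(1 for _ in range(50) if bucket.acquire())
print(sent)  # roughly the burst capacity: ~20 of 50 immediate attempts succeed
```

A real client would sleep and retry `acquire()` rather than dropping the request; the point is that the bucket absorbs a burst of `capacity` requests, then forces you back down to `rate`.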
For example, sending a request through a scraping router keeps the call itself simple:

curl "https://www.scraperouter.com/api/v1/scrape/" \
  -H "Authorization: Api-Key $api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'

With a router layer, you still need to think about burst behavior. The annoying part in production is that the problem is often not total volume; it's 50 requests landing in the same second.
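One way to keep 50 requests from landing in the same second is to cap concurrency and add a random start delay per request. This is a minimal sketch: the URLs are placeholders, and the inner sleep stands in for a real HTTP call (e.g. via aiohttp or httpx):

```python
import asyncio
import random

MAX_CONCURRENCY = 5  # hard cap on in-flight requests
MAX_JITTER = 2.0     # spread start times across a 2-second window

sem = asyncio.Semaphore(MAX_CONCURRENCY)

async def fetch_with_spread(url: str) -> str:
    # Random start delay so workers don't all fire in the same second.
    await asyncio.sleep(random.uniform(0, MAX_JITTER))
    async with sem:
        # Placeholder for the real request.
        await asyncio.sleep(0.01)
        return url

async def main():
    urls = [f"https://example.com/page/{i}" for i in range(50)]
    # gather() preserves input order even though start times are jittered.
    return await asyncio.gather(*(fetch_with_spread(u) for u in urls))

results = asyncio.run(main())
print(len(results))
```

The semaphore bounds the spike height and the jitter spreads the spike out in time; either alone helps, together they flatten the burst.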

Use cases

  • Job queue flushes: a backlog clears and workers suddenly send a big wave of requests.
  • Retry storms: failed requests retry together and create a second spike worse than the first.
  • Autoscaled scraper fleets: new instances come online and all start fetching immediately.
  • Multi-tenant scraping APIs: one customer or workload can create short spikes that trip provider or target-side limits.
  • E-commerce and search scraping: pagination, product detail fetches, and enrichment steps can bunch up into bursts if not scheduled carefully.
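For the retry-storm case, a common mitigation is exponential backoff with full jitter, so failed requests don't all retry at the same moment. This is a sketch under assumptions: `do_request` stands in for your real HTTP call and returns a `(status_code, body)` pair, and the `fake_request` in the usage below is a hypothetical server that 429s twice before succeeding:

```python
import random
import time

def backoff_delays(base: float = 0.5, cap: float = 30.0, attempts: int = 5):
    """Exponential backoff with full jitter: each delay is drawn from
    [0, min(cap, base * 2**attempt)], which de-correlates retry storms."""
    for attempt in range(attempts):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

def call_with_backoff(do_request, attempts: int = 5):
    """Retry `do_request` on 429 responses, sleeping a jittered,
    exponentially growing delay between tries."""
    for delay in backoff_delays(attempts=attempts):
        status, body = do_request()
        if status != 429:
            return status, body
        time.sleep(delay)
    return do_request()  # final attempt; let the caller handle a 429

# Hypothetical endpoint: rate-limited on the first two calls, then fine.
state = {"calls": 0}
def fake_request():
    state["calls"] += 1
    return (429, "") if state["calls"] <= 2 else (200, "ok")

print(call_with_backoff(fake_request))  # → (200, 'ok')
```

If the target sends a Retry-After header on its 429s, honoring that value beats any locally computed delay.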

Related terms

Rate limit, 429 Too Many Requests, Token bucket, Concurrency limit, Backoff, Retry strategy, Throttling, Proxy rotation