Examples
A simple version looks like this:
- Server returns a challenge before the real page
- Client solves it in the browser or in a headless session
- Server verifies the answer quickly and sets a temporary pass token
```shell
curl https://target-site.example
# returns a PoW challenge page instead of the content
```
In scraping, the annoying part is that PoW changes the cost model. One request is fine. Ten thousand requests suddenly mean real CPU time, more latency, and more infrastructure burn.
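To make that concrete, here is a back-of-envelope estimate. The difficulty matches the example challenge below (6 leading hex zeros); the hash rate is an assumption for a single-threaded interpreted-language solver, so plug in your own measurement:

```python
# Rough cost of PoW at scale (assumed numbers -- benchmark your own stack).
expected_hashes = 16 ** 6          # ~16.7M sha256 tries for 6 leading hex zeros
hash_rate = 2_000_000              # assumed hashes/sec, single thread (an assumption)
seconds_per_challenge = expected_hashes / hash_rate
requests = 10_000
cpu_hours = requests * seconds_per_challenge / 3600
print(f"{seconds_per_challenge:.1f}s per challenge, ~{cpu_hours:.0f} CPU-hours for {requests} requests")
```

Even with generous assumptions, that is hours of pure CPU just to earn the right to make requests, which is exactly the cost-model shift described above.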
A rough flow:
```json
{
  "challenge": "find a nonce so sha256(prefix + nonce) starts with 000000",
  "difficulty": 6,
  "expires_in": 30
}
```
The client brute-forces a nonce, sends it back, and the server checks it fast. That asymmetry is the whole point: expensive enough to slow bots down, cheap enough for the server to verify.
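The brute-force-and-verify round trip can be sketched in Python. The prefix value is made up, and difficulty is lowered to 4 so the demo finishes in well under a second; real challenges use higher difficulties like the 6 above:

```python
import hashlib
import itertools

def solve(prefix: str, difficulty: int) -> int:
    """Brute-force a nonce so sha256(prefix + nonce) starts with `difficulty` hex zeros."""
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{prefix}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

def verify(prefix: str, nonce: int, difficulty: int) -> bool:
    """A single hash to check -- this cheap side is the asymmetry the scheme relies on."""
    digest = hashlib.sha256(f"{prefix}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce = solve("abc123:", 4)   # millions of tries at real difficulties, ~tens of thousands here
assert verify("abc123:", nonce, 4)
```

Note that `solve` does unbounded work while `verify` does exactly one hash; raising the difficulty by one hex digit multiplies the solver's expected work by 16 without changing the server's cost at all.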
Practical tips
- Treat PoW as a cost layer, not a hard block: it won’t stop determined scrapers, it just makes large-scale scraping more expensive.
- Measure the real overhead: CPU time, request latency, concurrency limits, and retry behavior all get worse once PoW is in the path.
- Watch for challenge expiry: some implementations give you a very short window, so solving too slowly or queueing requests badly causes avoidable failures.
- Don’t confuse PoW with a CAPTCHA: PoW tests compute; a CAPTCHA tests human interaction. They solve different problems and fail in different ways.
- In production scraping, budget for it explicitly: more workers can actually make things worse if every worker is burning CPU just to earn the right to make the next request.
- If you’re routing traffic through scraping infrastructure, detect it early: you want to know whether a target is using PoW before you scale the job, not after your costs jump.
- If ScrapeRouter is in the stack: this is exactly the kind of anti-bot detail you want abstracted away, because solving one target’s challenge is easy; keeping it working across many targets is the part that wastes engineering time.
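Detecting PoW early can be as simple as a response heuristic run before a job scales. This is a sketch, not a vendor-specific check: the marker words and status codes are assumptions you would tune per target:

```python
import re

# Assumed markers: vocabulary that PoW interstitials tend to contain.
POW_MARKERS = re.compile(r"proof[- ]of[- ]work|nonce|difficulty|challenge", re.IGNORECASE)

def looks_like_pow_challenge(status: int, body: str) -> bool:
    """Heuristic: challenge-ish status code plus challenge-ish vocabulary in the body."""
    return status in (403, 429, 503) and bool(POW_MARKERS.search(body))

sample = '<html><body>{"challenge": "...", "difficulty": 6}</body></html>'
print(looks_like_pow_challenge(503, sample))  # True
```

Running a probe request through a check like this before launching thousands of workers is how you find out about the compute tax before it shows up on the bill.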
Use cases
- Bot mitigation for websites: small sites and API endpoints use PoW to make bulk automated access more expensive without blocking normal users too aggressively.
- Protection against scraper floods: tools like Anubis-style web firewalls put a PoW challenge in front of content so every client has to pay a little compute tax.
- Rate control without full identity checks: instead of requiring login or heavy fingerprinting, a site can require proof that the client spent some CPU effort.
- Blockchain and distributed systems: outside web scraping, PoW is also the mechanism used in some cryptocurrencies to prove miners did computational work before adding blocks.