Examples
Most scraping code that says "use CSS" really means use a CSS selector.
from bs4 import BeautifulSoup

html = """
<html>
<body>
<div class="product-card" data-sku="123">
<h2 class="title">Running Shoes</h2>
<span class="price">$79</span>
<a href="/p/123">View product</a>
</div>
</body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Scope nested lookups to the card so a stray match elsewhere on the page can't leak in
card = soup.select_one("div.product-card")
title = card.select_one("h2.title").get_text(strip=True)
price = card.select_one("span.price").get_text(strip=True)
link = card.select_one("a")["href"]
print(title, price, link)  # Running Shoes $79 /p/123
# Common selector patterns
"div.product-card" # element with class
"#main" # element with id
"a[href]" # element with attribute
"div[data-sku='123']" # exact attribute match
"ul > li" # direct children
".product-card .price" # nested descendant
In browser devtools, CSS selectors are usually the fastest way to test whether your extraction logic is sane before you write code.
Practical tips
- Treat CSS as a locator tool, not a guarantee. A selector that works today can break next week because someone renamed a class or shuffled the DOM.
- Prefer stable attributes over styling classes: data-*, semantic attributes, consistent container structure.
- Avoid selectors that are too long. If your selector looks like a full DOM breadcrumb, it will probably die on the next frontend deploy.
- Be careful with autogenerated class names from React, Vue, Tailwind-heavy builds, or CSS-in-JS setups: they often change, and they change for no useful reason.
- Test selectors against multiple pages, not just one lucky example.
- In production scraping, CSS selectors are usually fine for static extraction. Once the page is JS-heavy, delayed, or anti-bot protected, the harder part is getting a clean rendered page consistently, not writing div.price.
- With ScrapeRouter, the point is not to replace CSS selectors. The point is to make the page retrieval layer less fragile so your selectors have a chance to keep working.
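The "test against multiple pages" tip can be turned into a tiny regression check that runs selectors over a set of saved fixtures. A sketch; the `check_selector` helper and the fixture HTML are invented for illustration:

```python
from bs4 import BeautifulSoup

def check_selector(selector, pages):
    """Return the names of fixture pages where the selector finds nothing.
    (Helper invented for illustration.)"""
    misses = []
    for name, html in pages.items():
        if BeautifulSoup(html, "html.parser").select_one(selector) is None:
            misses.append(name)
    return misses

# Two saved fixtures: one matches the old markup, one reflects a redesign
pages = {
    "listing_old.html": '<span class="price">$79</span>',
    "listing_new.html": '<span data-testid="price">$79</span>',
}

# A class-based selector silently breaks on the redesigned page...
print(check_selector("span.price", pages))
# ...while a selector list covering both variants keeps matching
print(check_selector("[data-testid='price'], span.price", pages))
```

Running this over a handful of real saved pages per site catches the "works on one lucky example" failure mode before it reaches production.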
# Better: anchored on stable attributes
product = soup.select_one('[data-testid="product-card"]')
price = soup.select_one('[itemprop="price"]')
# Riskier: tied to presentation classes
price = soup.select_one('.text-red-500.font-bold.md\\:text-xl')
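The stable-first idea can be combined with a fallback chain, so a presentation-class selector is only the last resort rather than the only option. A sketch; `first_match` and the HTML are invented for illustration:

```python
from bs4 import BeautifulSoup

def first_match(soup, selectors):
    """Try selectors from most to least stable, return the first hit.
    (Helper invented for illustration.)"""
    for sel in selectors:
        node = soup.select_one(sel)
        if node is not None:
            return node
    return None

html = '<div><span class="text-red-500 font-bold">$79</span></div>'
soup = BeautifulSoup(html, "html.parser")

price = first_match(soup, [
    '[itemprop="price"]',        # most stable: semantic microdata
    '[data-testid="price"]',     # stable: explicit test hook
    '.text-red-500.font-bold',   # last resort: presentation classes
])
print(price.get_text(strip=True))
```

When the fallback fires in production, that is also a useful signal to log: it means the page changed and the stable anchors are gone.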
Use cases
- Extracting product data: title, price, availability, image URLs, links
- Pulling content from article pages: headline, author, publish date, body blocks
- Navigating repeated page structures: search results, listing cards, table rows
- Targeting elements in browser automation: click buttons, fill inputs, wait for components
- Building parsers that are readable enough for another engineer to debug at 2 a.m.
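The article-page use case follows the same shape as the product example: one selector per field, collected into a plain dict that another engineer can actually read. A sketch; the HTML, class names, and field names are assumptions for illustration:

```python
from bs4 import BeautifulSoup

html = """
<article>
  <h1 class="headline">Why Selectors Break</h1>
  <span class="byline">Jane Doe</span>
  <time datetime="2024-05-01">May 1, 2024</time>
  <div class="body"><p>First paragraph.</p><p>Second paragraph.</p></div>
</article>
"""
soup = BeautifulSoup(html, "html.parser")

article = {
    "headline": soup.select_one("h1.headline").get_text(strip=True),
    "author": soup.select_one(".byline").get_text(strip=True),
    # Prefer the machine-readable datetime attribute over the display text
    "published": soup.select_one("time")["datetime"],
    "body": [p.get_text(strip=True) for p in soup.select(".body > p")],
}
print(article)
```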