Glossary

CSS

CSS usually means Cascading Style Sheets, the language browsers use to control how HTML looks on the page. In scraping, though, people often mean CSS selectors: the pattern syntax used to find elements like buttons, links, product titles, or price blocks inside a document.

Examples

Most scraping code that says "use CSS" really means use a CSS selector.

from bs4 import BeautifulSoup

html = """
<html>
  <body>
    <div class="product-card" data-sku="123">
      <h2 class="title">Running Shoes</h2>
      <span class="price">$79</span>
      <a href="/p/123">View product</a>
    </div>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

card = soup.select_one("div.product-card")
title = soup.select_one("h2.title").get_text(strip=True)
price = soup.select_one("span.price").get_text(strip=True)
link = soup.select_one("div.product-card a")["href"]

print(title, price, link)

# Common selector patterns
"div.product-card"          # element with class
"#main"                     # element with id
"a[href]"                   # element with attribute
"div[data-sku='123']"       # exact attribute match
"ul > li"                   # direct children
".product-card .price"      # nested descendant

In browser devtools, CSS selectors are usually the fastest way to test whether your extraction logic is sane before you write code.

Practical tips

  • Treat CSS as a locator tool, not a guarantee. A selector that works today can break next week because someone renamed a class or shuffled the DOM.
  • Prefer stable attributes over styling classes: data-*, semantic attributes, consistent container structure.
  • Avoid selectors that are too long. If your selector looks like a full DOM breadcrumb, it will probably die on the next frontend deploy.
  • Be careful with autogenerated class names from React, Vue, Tailwind-heavy builds, or CSS-in-JS setups: they often change, and they change for no useful reason.
  • Test selectors against multiple pages, not just one lucky example.
  • In production scraping, CSS selectors are usually fine for static extraction. Once the page is JS-heavy, delayed, or anti-bot protected, the harder part is getting a clean rendered page consistently, not writing div.price.
  • With ScrapeRouter, the point is not to replace CSS selectors. The point is to make the page retrieval layer less fragile so your selectors have a chance to keep working.

# Better: anchored on stable attributes
product = soup.select_one('[data-testid="product-card"]')
price = soup.select_one('[itemprop="price"]')

# Riskier: tied to presentation classes
price = soup.select_one('.text-red-500.font-bold.md\\:text-xl')
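One way to act on the "prefer stable attributes" advice is a small fallback helper that tries selectors in preference order. This is a sketch, not a BeautifulSoup feature; first_match is a hypothetical helper name:

```python
from bs4 import BeautifulSoup

def first_match(soup, selectors):
    """Try selectors in preference order; return the first element found."""
    for sel in selectors:
        el = soup.select_one(sel)
        if el is not None:
            return el
    return None

html = '<span itemprop="price">$79</span>'
soup = BeautifulSoup(html, "html.parser")

# Stable attribute first, presentation classes only as a last resort.
price = first_match(soup, ['[itemprop="price"]', ".price", ".text-red-500"])
print(price.get_text(strip=True) if price else "not found")
```

If the site drops the itemprop attribute later, the scraper degrades to the riskier class-based selectors instead of failing outright.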

Use cases

  • Extracting product data: title, price, availability, image URLs, links
  • Pulling content from article pages: headline, author, publish date, body blocks
  • Navigating repeated page structures: search results, listing cards, table rows
  • Targeting elements in browser automation: click buttons, fill inputs, wait for components
  • Building parsers that are readable enough for another engineer to debug at 2 a.m.
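The repeated-structures case above usually pairs select with a loop over each card. A minimal sketch, assuming invented listing markup:

```python
from bs4 import BeautifulSoup

html = """
<div class="results">
  <div class="card"><h3>Item A</h3><span class="price">$10</span></div>
  <div class="card"><h3>Item B</h3><span class="price">$12</span></div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

rows = []
for card in soup.select("div.results > div.card"):
    # Scope inner selectors to the card so fields stay paired correctly.
    rows.append({
        "title": card.select_one("h3").get_text(strip=True),
        "price": card.select_one("span.price").get_text(strip=True),
    })

print(rows)
```

Scoping select_one to each card, rather than running two global queries and zipping the results, keeps titles and prices paired even when a card is missing a field.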

Related terms

XPath, HTML, DOM, Parser, Headless Browser, JavaScript Rendering, Web Scraping