Glossary

IFrame

An IFrame (inline frame) is an HTML element that embeds one web page inside another. In scraping, this matters because the data you want is often not in the main page's HTML at all. Instead, it is loaded from the iframe's src as a separate document with its own requests, cookies, and sometimes its own anti-bot problems.

Examples

Many pages appear to contain the content you want, but the actual data is sitting in an iframe.

<iframe src="https://widgets.example.com/reviews?id=123" loading="lazy"></iframe>

If you scrape only the parent page, you may just get the iframe tag and nothing inside it. The real content is usually at the src URL.

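from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

PAGE_URL = "https://example.com/product/123"

# Fetch and parse the parent page
html = requests.get(PAGE_URL, timeout=10).text
soup = BeautifulSoup(html, "html.parser")
iframe = soup.select_one("iframe[src]")  # first iframe that has a src attribute

if iframe:
    # src may be relative or protocol-relative, so resolve it against the parent URL
    iframe_url = urljoin(PAGE_URL, iframe["src"])
    iframe_html = requests.get(iframe_url, timeout=10).text
    print(iframe_html[:500])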
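
In practice a page can hold several iframes, and a src can be relative, protocol-relative, or a placeholder such as about:blank. The following is a minimal sketch of collecting every iframe src and normalizing it before fetching. It uses the standard library's html.parser instead of BeautifulSoup only to stay dependency-free, and the names IframeSrcCollector and iframe_urls are illustrative, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class IframeSrcCollector(HTMLParser):
    """Collect the raw src attribute of every <iframe> start tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:  # skip iframes with a missing or empty src
                self.sources.append(src.strip())


def iframe_urls(parent_html, parent_url):
    """Return absolute http(s) URLs for the iframes found in parent_html."""
    collector = IframeSrcCollector()
    collector.feed(parent_html)
    urls = []
    for src in collector.sources:
        # Resolve relative and protocol-relative values against the parent URL
        absolute = urljoin(parent_url, src)
        # Drop placeholders such as about:blank or javascript: URLs
        if urlsplit(absolute).scheme in ("http", "https"):
            urls.append(absolute)
    return urls


sample = (
    '<iframe src="/embed/reviews?id=123"></iframe>'
    '<iframe src="//widgets.example.com/map"></iframe>'
    '<iframe src="about:blank"></iframe>'
)
print(iframe_urls(sample, "https://example.com/product/123"))
```

Each returned URL can then be fetched like any other page. Iframes that receive their src from JavaScript after load will not show up here, which is one reason to fall back to browser automation.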

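With browser automation, selectors run against the top-level page do not reach inside an iframe, so you need to target the frame before selecting elements. The snippet below uses Playwright's Python API.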

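# page is an existing Playwright Page that has already loaded the parent URL
frame = page.frame(url=lambda url: "widgets.example.com" in url)
if frame is not None:  # page.frame() returns None when no frame matches
    content = frame.locator(".review").all_text_contents()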

Practical tips

  • Check the HTML first: if the content is missing from the main response, inspect for iframe elements and follow the src.
  • Treat the iframe as a separate page: separate document, separate network requests, separate failure modes.
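  • Watch for cross-origin issues: JavaScript running in the parent page cannot read the DOM of a cross-origin iframe because of the same-origin policy. Browser automation tools get around this by addressing the frame directly through their own frame APIs.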
  • Don't assume one request is enough: parent page loads, then iframe loads, then the iframe may call APIs after that.
  • Look at the network tab: often the iframe is just a wrapper around an easier JSON or XHR endpoint.
  • In production, iframes break scrapers quietly: the page still loads, but the data is empty because your parser never followed the embedded document.
  • With ScrapeRouter: if a target relies on browser rendering, frame handling is part of the job. You still want to know whether the data is in the parent DOM, the iframe DOM, or an API behind it, because that affects cost and reliability.
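
One way to act on the tips above, and to keep an iframe from failing quietly, is to check where the expected elements actually are before parsing. The sketch below is an assumption-laden illustration, not a ScrapeRouter feature: diagnose and its return labels are hypothetical names, and matching on a single CSS class is a simplification. It reports whether the class appears in the parent HTML, whether there are iframes to follow instead, or whether the data is likely rendered by JavaScript or served by an API.

```python
from html.parser import HTMLParser


class _Probe(HTMLParser):
    """Count elements carrying a given class and record iframe src values."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = 0
        self.iframe_srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # class may hold several space-separated names, or be absent
        if self.class_name in (attrs.get("class") or "").split():
            self.matches += 1
        if tag == "iframe" and attrs.get("src"):
            self.iframe_srcs.append(attrs["src"])


def diagnose(parent_html, class_name):
    """Return (location, match_count, iframe_srcs) for the parent HTML."""
    probe = _Probe(class_name)
    probe.feed(parent_html)
    if probe.matches:
        return ("parent", probe.matches, [])
    if probe.iframe_srcs:
        # Nothing matched in the parent, but there are embedded documents to follow
        return ("iframe", 0, probe.iframe_srcs)
    # Neither: likely rendered client-side or fetched from an API
    return ("not_in_html", 0, [])


page = '<h1>Product</h1><iframe src="https://widgets.example.com/reviews?id=123" loading="lazy"></iframe>'
print(diagnose(page, "review"))
```

Logging or raising on anything other than "parent" turns an empty result into a visible failure instead of a silent one.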

Use cases

  • Scraping embedded reviews, maps, booking widgets, payment forms, and chat widgets.
  • Pulling content from third-party embeds that are isolated from the main site.
  • Debugging why a selector works in DevTools but returns nothing from a simple HTTP scraper.
  • Deciding whether you need raw HTTP, browser automation, or direct API extraction.

Related terms

DOM, HTML, Headless Browser, JavaScript Rendering, XHR, Request Headers, Session, Cookie