Web Scraping Glossary

Dynamic content

Dynamic content is content that JavaScript loads or renders after the initial HTML response. It complicates web scraping because a plain HTTP fetch returns only the initial markup, which often omits the data that scripts add later.
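A minimal sketch of why this matters, using made-up strings rather than a real site. `INITIAL_HTML` and `API_PAYLOAD` are hypothetical stand-ins for what a server might return for the page and for the JSON endpoint the page's script calls afterwards.

```python
import json

# Hypothetical initial HTML: an empty container that a script fills in later.
INITIAL_HTML = (
    '<html><body><div id="app"></div>'
    '<script src="/static/app.js"></script></body></html>'
)

# Hypothetical JSON the script would fetch after the page loads.
API_PAYLOAD = (
    '{"products": [{"name": "Laptop", "price": 999},'
    ' {"name": "Phone", "price": 499}]}'
)

# The product data is not present in the markup a plain fetch returns.
found_in_html = "Laptop" in INITIAL_HTML

# The same data is available once the JSON payload is parsed.
products = json.loads(API_PAYLOAD)["products"]
names = [product["name"] for product in products]

print(found_in_html)
print(names)
```

When a site exposes its data through an endpoint like this, reading the JSON directly can be simpler than rendering the page, though whether such an endpoint exists depends on the site.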

JavaScript

JavaScript makes websites interactive by running code directly in the browser. It complicates web scraping because many sites use it to load content dynamically, so extracting that data often requires a tool that executes scripts, such as a headless browser.

HTTP request

An HTTP request is a message sent to a web server asking for specific information. Learn how HTTP requests work in web scraping, including methods like GET and POST, essential headers, request bodies, and query parameters.
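As a rough illustration of the parts named above, the sketch below builds a GET request with query parameters and a POST request with a form body using Python's standard library. Nothing is sent over the network. The URL, the `User-Agent` string, and the helper names `build_get` and `build_post` are placeholders chosen for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://example.com/search"  # placeholder URL


def build_get(base_url, params, user_agent):
    """Build a GET request whose data travels in the query string."""
    query = urlencode(params)
    return Request(
        f"{base_url}?{query}",
        headers={"User-Agent": user_agent},
        method="GET",
    )


def build_post(base_url, form, user_agent):
    """Build a POST request whose data travels in the request body."""
    body = urlencode(form).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={
            "User-Agent": user_agent,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


get_req = build_get(BASE_URL, {"q": "laptops", "page": 2}, "glossary-example/0.1")
post_req = build_post(BASE_URL, {"username": "demo", "token": "abc"}, "glossary-example/0.1")

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

Passing either object to `urllib.request.urlopen` would actually send it; that step is left out so the example runs offline.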

XPath

XPath is a query language that lets you navigate and extract data from HTML and XML documents by specifying paths to elements. It's one of the most powerful tools for web scraping because it enables precise targeting of specific elements.
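A small sketch of path-based targeting, using the limited XPath subset that Python's built-in `xml.etree.ElementTree` supports. The markup is an invented, well-formed snippet; real-world HTML is often not well-formed, and full XPath usually calls for a dedicated library.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed markup used only for illustration.
DOC = """<html><body>
<div class="product"><h2>Laptop</h2><span class="price">999</span></div>
<div class="product"><h2>Phone</h2><span class="price">499</span></div>
<div class="ad"><h2>Sponsored</h2></div>
</body></html>"""

root = ET.fromstring(DOC)

# Select the <h2> child of every <div> whose class attribute is "product".
names = [h2.text for h2 in root.findall(".//div[@class='product']/h2")]

# Select the price <span> inside the same product blocks.
prices = [
    span.text
    for span in root.findall(".//div[@class='product']/span[@class='price']")
]

print(names)
print(prices)
```

The attribute predicate is what keeps the "Sponsored" heading out of the results, even though it is also an `<h2>` inside a `<div>`.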

CSS selector

A CSS selector is a pattern that targets specific HTML elements on a web page. In web scraping, CSS selectors act as precise instructions that tell your scraper exactly which elements to extract data from.
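To make the idea of a selector as a matching pattern concrete, here is a toy matcher written for this glossary. It is not a real CSS engine: it handles only simple selectors such as `li`, `.item`, `li.item`, and `#total`, and it assumes matched elements contain no unclosed void tags like `<br>`. The names `select` and `SimpleSelectorMatcher` are hypothetical.

```python
import re
from html.parser import HTMLParser


def parse_simple_selector(selector):
    """Split a selector like 'li.item' or '#total' into (tag, id, classes)."""
    tag, elem_id, classes = None, None, []
    for kind, name in re.findall(r"([.#]?)([\w-]+)", selector):
        if kind == ".":
            classes.append(name)
        elif kind == "#":
            elem_id = name
        else:
            tag = name
    return tag, elem_id, classes


class SimpleSelectorMatcher(HTMLParser):
    """Collect the text of every element matching one simple selector."""

    def __init__(self, selector):
        super().__init__()
        self.tag, self.elem_id, self.classes = parse_simple_selector(selector)
        self.results = []
        self._depth = 0  # greater than 0 while inside a matched element
        self._buffer = []

    def _matches(self, tag, attrs):
        attrs = dict(attrs)
        if self.tag and tag != self.tag:
            return False
        if self.elem_id and attrs.get("id") != self.elem_id:
            return False
        present = (attrs.get("class") or "").split()
        return all(name in present for name in self.classes)

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1
        elif self._matches(tag, attrs):
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def select(html, selector):
    matcher = SimpleSelectorMatcher(selector)
    matcher.feed(html)
    return matcher.results


PAGE = """
<ul>
  <li class="item featured">Laptop</li>
  <li class="item">Phone</li>
  <li class="note">Out of stock</li>
</ul>
<p id="total">2 items</p>
"""

items = select(PAGE, "li.item")
featured = select(PAGE, ".featured")
total = select(PAGE, "#total")
print(items, featured, total)
```

In practice a scraper would rely on an established parsing library for selector support, including combinators and pseudo-classes that this sketch ignores.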

Status code

A status code is a three-digit number that tells you whether your web request succeeded or failed. Learn how to interpret and handle status codes to build reliable scrapers.
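One possible way to act on status codes is sketched below. The grouping into "ok", "redirect", "retry", and "give_up" and the choice of which codes count as retryable are assumptions made for this example, not a standard policy.

```python
from http import HTTPStatus

# Assumed set of codes worth retrying: rate limiting and transient server errors.
RETRYABLE = {
    HTTPStatus.TOO_MANY_REQUESTS,      # 429
    HTTPStatus.INTERNAL_SERVER_ERROR,  # 500
    HTTPStatus.BAD_GATEWAY,            # 502
    HTTPStatus.SERVICE_UNAVAILABLE,    # 503
    HTTPStatus.GATEWAY_TIMEOUT,        # 504
}


def classify(status):
    """Map a numeric status code to the action a scraper might take."""
    if status in RETRYABLE:
        return "retry"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    return "give_up"


for code in (200, 301, 404, 429, 503):
    print(code, HTTPStatus(code).phrase, classify(code))
```

Checking the retryable set first matters because those codes would otherwise fall through to "give_up" along with other 4xx and 5xx responses.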

HTTP response

An HTTP response is the message a web server sends back after receiving a request. It contains a status code, headers, and a body (often HTML) that scrapers parse to extract data from websites.
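The layout of a response can be seen by splitting a raw HTTP/1.1 message by hand. The response text below is invented for illustration, and `parse_response` is a hypothetical helper; HTTP client libraries normally do this parsing for you.

```python
# Invented raw response: status line, headers, blank line, then the body.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 28\r\n"
    "\r\n"
    "<html><body>Hi</body></html>"
)


def parse_response(raw):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


status, reason, headers, body = parse_response(RAW_RESPONSE)
print(status, reason)
print(headers["content-type"])
print(body)
```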

DOM (Document Object Model)

The DOM (Document Object Model) is a programming interface that represents an HTML document as a tree of objects. It's the live, interactive model the browser builds from HTML code, and it's what browser-based scrapers query; scrapers that work from raw HTML build a similar tree with a parser.
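The tree structure can be sketched with the standard library's `html.parser`. This builds a static tree from a string, not a live browser DOM, and the `Node`, `TreeBuilder`, and `outline` names are hypothetical. It assumes every tag in the input is explicitly closed.

```python
from html.parser import HTMLParser


class Node:
    """One element in the tree, with a link to its parent and children."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Turn a stream of start/end tags into a nested tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self._current)
        self._current.children.append(node)
        self._current = node  # descend into the new element

    def handle_endtag(self, tag):
        if self._current.parent is not None:
            self._current = self._current.parent  # climb back up

    def handle_data(self, data):
        self._current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented tag names, one per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><body><h1>Title</h1><ul><li>One</li><li>Two</li></ul></body></html>")
tree_lines = outline(builder.root)
print("\n".join(tree_lines))
```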

HTML

HTML (Hypertext Markup Language) is the standard language for creating web pages. It uses tags to structure content, and in web scraping, HTML is the raw material you parse and extract data from.
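As a sketch of treating HTML as raw material, the example below walks the tags of an invented snippet and pulls out each link's `href` and visible text. `LinkExtractor` is a hypothetical class built on the standard library's `html.parser`, and it assumes links are not nested inside one another.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # set while inside an <a> element
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Invented snippet used only for illustration.
PAGE = (
    '<p>See <a href="/docs">the docs</a> or '
    '<a href="https://example.com/blog">our <b>blog</b></a>.</p>'
)

extractor = LinkExtractor()
extractor.feed(PAGE)
links = extractor.links
print(links)
```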

Web crawler

A web crawler is a bot that systematically browses the internet by following links from page to page, discovering and mapping content across websites. Crawlers find what pages exist, while scrapers extract specific data from those pages.
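The link-following loop can be sketched as a breadth-first traversal. To keep the example offline, `SITE` is an invented in-memory map from URL to the links found on that page, and `get_links` stands in for "fetch the page and extract its links". Staying on the starting host and capping the page count are assumptions of this sketch, not requirements of crawling in general.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Invented site: each URL maps to the hrefs that appear on that page.
SITE = {
    "https://example.com/": ["/about", "/products"],
    "https://example.com/about": ["/", "https://other.example/"],
    "https://example.com/products": ["/products/1", "/about"],
    "https://example.com/products/1": [],
}


def crawl(start, get_links, max_pages=100):
    """Visit pages breadth-first, staying on the starting host."""
    host = urlparse(start).netloc
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for href in get_links(url):
            target = urljoin(url, href)  # resolve relative links
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return order


visited = crawl("https://example.com/", lambda url: SITE.get(url, []))
print(visited)
```

The `seen` set is what prevents the crawler from looping forever when pages link back to each other, as the home and about pages do here.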
