Web Scraping Glossary

JavaScript challenge

A JavaScript challenge is a security mechanism that requires the browser to execute JavaScript before the site serves its content. It is commonly used to block automated scrapers that cannot run JavaScript.

Rate limiting

Rate limiting controls how many requests you can send to a website within a specific time window. When scraping, exceeding these limits typically triggers HTTP 429 (Too Many Requests) responses or IP bans.
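A scraper can stay under a limit by pacing itself and by backing off when it does receive a 429. The sketch below is a minimal illustration using only the standard library. The class name `Throttle`, the helper `retry_delay`, and the limits used are hypothetical; real sites publish (or hide) their own limits, and `Retry-After` may also arrive as an HTTP date, which this sketch does not parse.

```python
import time


class Throttle:
    """Allow at most max_requests per window_seconds (sliding window)."""

    def __init__(self, max_requests, window_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the logic can be tested
        self._sleep = sleep
        self._sent = []          # timestamps of requests still in the window

    def _prune(self, now):
        # Forget requests that have already left the window.
        self._sent = [t for t in self._sent if now - t < self.window_seconds]

    def wait(self):
        """Block until one more request fits in the window, then record it."""
        now = self._clock()
        self._prune(now)
        if len(self._sent) >= self.max_requests:
            # Sleep until the oldest recorded request expires.
            self._sleep(self.window_seconds - (now - self._sent[0]))
            now = self._clock()
            self._prune(now)
        self._sent.append(now)


def retry_delay(status_code, retry_after=None, attempt=0, base=1.0, cap=60.0):
    """Seconds to wait before retrying, or None if the response was not a 429.

    Assumption: Retry-After is honored only when it is a whole number of
    seconds; otherwise fall back to capped exponential backoff.
    """
    if status_code != 429:
        return None
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return min(cap, base * 2 ** attempt)
```

Calling `throttle.wait()` before each request keeps the scraper inside the window, and `retry_delay` decides how long to pause after a 429 slips through.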

CAPTCHA

A CAPTCHA is a security test that distinguishes humans from bots by presenting challenges like distorted text or image selection. It protects websites from spam and unauthorized automation while creating obstacles for web scraping operations.

Public data

Public data is any information freely accessible on websites without requiring login credentials or special permissions, forming the foundation of ethical web scraping practices.

Password-protected content

Password-protected content is any web page requiring authentication before access. Learn how web scrapers handle logins, sessions, and the challenges of extracting data from authenticated pages.

Token

A token is a small string that proves your identity or permissions when interacting with websites and APIs, removing the need to send credentials with every request.
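As a rough illustration, many APIs expect the token in an `Authorization` header. The sketch below assumes the common `Bearer` scheme; the function name, URL, and token value are placeholders, and a given site may instead use a custom header or a query parameter.

```python
import urllib.request


def build_authenticated_request(url, token):
    """Attach the token as a Bearer credential instead of a username/password."""
    # Assumption: the target API accepts "Authorization: Bearer <token>".
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


# Hypothetical endpoint and token, for illustration only.
request = build_authenticated_request("https://api.example.com/items", "abc123")
print(request.get_header("Authorization"))
```

The request object is only built here, not sent; passing it to `urllib.request.urlopen` would perform the actual call.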

Session

A session is a continuous interaction between your scraper and a website where the server remembers you across multiple requests. Sessions let you stay logged in, access user-specific content, and avoid detection by maintaining realistic browsing patterns.
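In code, "the server remembers you" usually means the client stores the cookies it receives and sends them back on later requests. A minimal sketch with the standard library follows; `make_session` and `login_then_fetch` are hypothetical helper names, and the URLs and form fields a real site expects will differ.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that stores cookies and replays them automatically."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login_then_fetch(opener, login_url, page_url, form):
    """POST a login form, then fetch a page within the same session.

    Assumption: the site logs you in via a plain form POST and tracks the
    session with cookies. Both URLs and the form fields are placeholders.
    """
    body = urllib.parse.urlencode(form).encode()
    with opener.open(login_url, data=body) as response:
        response.read()          # cookies set here land in the jar
    with opener.open(page_url) as response:
        return response.read()   # the jar's cookies are sent with this request
```

Reusing one opener for every request is what keeps the session alive; creating a fresh opener per request would start over with an empty cookie jar.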

Authentication

Authentication verifies your identity before you can access protected content on a website. Learn how authentication works in web scraping, including different methods like API keys, OAuth, CSRF tokens, and session cookies, plus common challenges you'll face when scraping authenticated sites.
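One of the challenges mentioned above, CSRF tokens, often shows up as a hidden field in the login form that must be echoed back with the credentials. The sketch below shows one way to collect such fields with the standard library. The field names `csrf_token`, `username`, and `password` are hypothetical, and some sites deliver the token in a cookie or header instead.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if attributes.get("type") == "hidden" and attributes.get("name"):
            self.hidden[attributes["name"]] = attributes.get("value") or ""


def build_login_form(html, username, password):
    """Merge the page's hidden fields (e.g. a CSRF token) with credentials."""
    parser = HiddenInputParser()
    parser.feed(html)
    form = dict(parser.hidden)
    # Assumption: the form's credential fields are named like this.
    form.update({"username": username, "password": password})
    return form


# Hypothetical login page markup, for illustration only.
page = '<form><input type="hidden" name="csrf_token" value="tok123"></form>'
print(build_login_form(page, "alice", "s3cret"))
```

The resulting dictionary is what would be URL-encoded and posted to the login endpoint.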

Session cookie

A session cookie is a temporary identifier that websites use to remember you during a single browsing session. It lives in your browser's memory and disappears when you close the browser, enabling features like login persistence and shopping carts.
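In HTTP terms, a cookie is a session cookie when its `Set-Cookie` header carries neither an `Expires` nor a `Max-Age` attribute. The sketch below classifies cookies on that basis; the function name and the cookie names in the example are hypothetical.

```python
from http.cookies import SimpleCookie


def classify_cookies(set_cookie_header):
    """Map each cookie name to True if it is a session cookie.

    A cookie counts as a session cookie when it has no Expires and no
    Max-Age attribute, so the client drops it when the session ends.
    """
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    return {
        name: not (morsel["expires"] or morsel["max-age"])
        for name, morsel in cookie.items()
    }


# Hypothetical header values, for illustration only.
print(classify_cookies("sessionid=abc123; Path=/; HttpOnly"))
print(classify_cookies("remember=1; Max-Age=3600; Path=/"))
```

A scraper that is not a browser has no window to close, so it decides for itself when the "session" ends, typically by discarding its cookie jar.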

URL parameters

URL parameters are pieces of information added to web addresses that control what content gets displayed. They're essential for web scraping because they let you filter, sort, and navigate through data programmatically without manual clicking.
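A short sketch of building parameterized URLs programmatically with the standard library follows. The helper names and the parameter names `page`, `sort`, and `category` are hypothetical; each site defines its own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **params):
    """Return url with the given query parameters added or overwritten."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))          # keep existing parameters
    query.update({key: str(value) for key, value in params.items()})
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_urls(base_url, pages, **filters):
    """Build one URL per page number, applying the same filters to each."""
    # Assumption: the site paginates with a 1-based "page" parameter.
    return [with_params(base_url, page=n, **filters) for n in range(1, pages + 1)]


# Hypothetical listing URL, for illustration only.
for url in page_urls("https://example.com/products", 3, category="books"):
    print(url)
```

`urlencode` also takes care of escaping, so values containing spaces or `&` do not break the address.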