Request headers

Request headers are key-value pairs sent with every HTTP request that identify your client and state your preferences. In web scraping, proper headers help your requests appear legitimate and control the format of the data you receive.

When you visit a website, your browser sends these headers automatically, telling the server who you are, what you want, and how you expect the response. When you scrape a website, you need to send them yourself, or your requests will look suspicious and get blocked.

What are request headers?

Every HTTP request consists of a method (like GET or POST), a URL, headers, and sometimes a body. Request headers act like an introduction to the server. They describe your client software, preferred content format, language settings, and authentication credentials.

For web scraping, headers serve two critical purposes. First, they help your scraper look like a legitimate browser instead of an automated bot. Second, they let you control what data you receive back, whether that is HTML, JSON, or content in a specific language.
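To see what a scraper sends before any customization, you can print a library's default headers. A quick sketch with Python's requests (exact values vary by version):

```python
import requests

# Print the headers requests sends when you don't set any yourself.
print(requests.utils.default_headers())
# Typically something like:
# {'User-Agent': 'python-requests/2.x', 'Accept-Encoding': 'gzip, deflate',
#  'Accept': '*/*', 'Connection': 'keep-alive'}
```

Those defaults announce the library by name, which is exactly what the sections below show you how to avoid.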

Common request headers you need to know

User-Agent

The User-Agent header identifies your client software. Browsers send detailed strings that include the browser name, version, and operating system. Most scraping libraries send obvious defaults like "python-requests/2.x", which websites can easily detect and block.

Setting a realistic User-Agent is the most basic step to avoid detection. Without it, many sites will reject your requests outright.
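With Python's requests, overriding the default is one line; the User-Agent string below is just an illustrative desktop Chrome value:

```python
import requests

# A realistic desktop-browser User-Agent; copy a current one from your own browser.
headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/124.0.0.0 Safari/537.36"
    )
}

response = requests.get("https://example.com", headers=headers)
print(response.status_code)
```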

Accept

This header tells the server what content types you can handle. Browsers typically request HTML, but if a site has an API, you can request JSON instead with "application/json". JSON responses are cleaner and easier to parse than HTML.
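For example, a request to a hypothetical API endpoint that prefers JSON:

```python
import requests

# Ask the server for JSON instead of HTML; the endpoint URL is a placeholder.
response = requests.get(
    "https://example.com/api/products",
    headers={"Accept": "application/json"},
)

# Only parse as JSON if the server actually returned it.
if "application/json" in response.headers.get("Content-Type", ""):
    data = response.json()
```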

Cookie

Cookies carry session data, login status, and preferences between requests. HTTP is stateless, meaning each request is independent. Cookies bridge that gap by letting the server recognize you across multiple requests.

You need cookies to access pages behind logins, handle consent popups, and maintain sessions while paginating through results. Some anti-bot systems also set cookies that must be returned on subsequent requests.
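Here is a minimal sketch of carrying cookies forward by hand; a session object (covered under best practices below) does this automatically:

```python
import requests

# First request: the server may set cookies (session IDs, anti-bot tokens).
first = requests.get("https://example.com/page/1")

# Return those cookies on the next request so the server recognizes you.
second = requests.get("https://example.com/page/2", cookies=first.cookies)
```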

Referer

The Referer header shows the URL of the page that led to your current request. When you click a link, your browser automatically sends this. Many sites check for a valid internal Referer and block requests that appear to come from nowhere.
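Setting it is a single header; the URLs below are placeholders:

```python
import requests

# Make the request look like it followed a link from within the site.
headers = {"Referer": "https://example.com/"}
response = requests.get("https://example.com/some-page", headers=headers)
```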

Authorization

This header carries credentials for protected APIs. The most common format is "Bearer" followed by an access token. Without the correct Authorization header, you will get 401 errors or limited data from authenticated endpoints.
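A typical Bearer-token request looks like this; the token and endpoint are placeholders:

```python
import requests

token = "YOUR_ACCESS_TOKEN"  # placeholder; obtain via the API's auth flow
response = requests.get(
    "https://example.com/api/orders",
    headers={"Authorization": f"Bearer {token}"},
)

if response.status_code == 401:
    print("Token is missing, expired, or invalid")
```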

Why headers matter for web scraping

Anti-bot systems inspect headers closely. Missing or misconfigured headers are one of the easiest ways to identify automated traffic. A request without a User-Agent or with mismatched headers stands out immediately.

Beyond avoiding blocks, headers affect what data you receive. The Accept-Language header determines the language of content. Accept-Encoding controls compression. Getting these right means cleaner data with fewer parsing surprises.
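For example, requesting English content with standard compression (requests decompresses gzip and deflate transparently):

```python
import requests

headers = {
    "Accept-Language": "en-US,en;q=0.9",  # prefer US English content
    "Accept-Encoding": "gzip, deflate",   # allow compressed responses
}

response = requests.get("https://example.com", headers=headers)
print(response.text[:200])  # already decompressed plain text
```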

Best practices for using request headers

Start by copying headers from a real browser. Open your browser's developer tools, make a request, and examine the headers sent. Use these as your template.
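The result is a template like the one below; the exact values should come from your own browser's Network tab, since the ones here are only illustrative:

```python
import requests

# Modeled on what a desktop Chrome browser sends; replace each value
# with what your own browser's developer tools show.
browser_headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Connection": "keep-alive",
}

response = requests.get("https://example.com", headers=browser_headers)
```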

Rotate User-Agent strings when scraping at scale, but keep them consistent within a session. Random headers on every request look more suspicious than a single realistic value.
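One way to do this, sketched with a small pool of example strings:

```python
import random
import requests

# A pool of current, realistic browser strings; keep these up to date.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.4 Safari/605.1.15",
]

session = requests.Session()
# Pick one User-Agent per session and keep it for every request in that session.
session.headers["User-Agent"] = random.choice(USER_AGENTS)

page1 = session.get("https://example.com/page/1")
page2 = session.get("https://example.com/page/2")  # same User-Agent as page1
```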

Use a session object in your scraping library to handle cookies automatically. This saves you from manually tracking and sending cookie values between requests.
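With Python's requests, for example, a Session stores any cookies the server sets and resends them on every later request (the login URL and form fields below are placeholders):

```python
import requests

session = requests.Session()

# Cookies set by the server at login are stored on the session...
session.post(
    "https://example.com/login",
    data={"username": "me", "password": "secret"},  # placeholder credentials
)

# ...and sent automatically with every subsequent request.
profile = session.get("https://example.com/account")
print(session.cookies.get_dict())
```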

Set the Referer header to simulate natural navigation. When scraping a product detail page, set the Referer to the category listing page where that product appeared.
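A sketch of that pattern, with placeholder URLs:

```python
import requests

category_url = "https://example.com/category/shoes"  # the listing page
product_urls = [
    "https://example.com/product/101",
    "https://example.com/product/102",
]

# Each product request claims the category page as its origin,
# mirroring how a visitor would click through from the listing.
for url in product_urls:
    response = requests.get(url, headers={"Referer": category_url})
```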

Request JSON when APIs are available. Many modern websites load data through internal APIs. Finding and using these endpoints with the right Accept header gives you structured data without parsing HTML.
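A sketch of that approach, using a made-up internal endpoint and response shape:

```python
import requests

# A hypothetical internal endpoint, discovered via the browser's Network tab.
response = requests.get(
    "https://example.com/api/search",
    params={"q": "laptops", "page": 1},
    headers={"Accept": "application/json"},
)

# The "results" field is an assumption; inspect the real response to find yours.
for item in response.json().get("results", []):
    print(item)
```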

How Browse AI can help

Configuring headers correctly requires trial and error, especially when dealing with sophisticated anti-bot systems. Browse AI handles all of this complexity for you. The platform uses real browsers that automatically send proper headers, manage cookies, and maintain realistic sessions. You simply point and click to select the data you want, and Browse AI takes care of the technical details that make scraping work reliably.
