User agent

A user agent is a text string that identifies your browser or scraping tool to web servers. Websites use user agents to detect bots, making proper user agent configuration essential for successful web scraping.

What is a user agent?

A user agent is a text string that your browser or scraping tool sends to a web server with every HTTP request. It identifies what software you're using to access the site, including your browser name, version, operating system, and sometimes your device type.

When you visit a website, your browser automatically includes this user agent string in the request headers. The server reads it to understand what kind of client is making the request. Originally, servers used this information to optimize content for different browsers and devices. Today, they also use it to detect and block bots.

For web scraping, user agents matter because websites analyze them to figure out if you're a real person using a browser or an automated script. Get it wrong, and you'll get blocked before you extract a single piece of data.

What a user agent string looks like

A typical user agent string follows a standard pattern that includes several components:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Breaking this down: Mozilla/5.0 appears in almost every modern user agent string for historical reasons. Windows NT 10.0 indicates the operating system. AppleWebKit/537.36 and KHTML, like Gecko tell the server about the rendering engine. Chrome/120.0.0.0 specifies the browser and version. Safari/537.36 is a compatibility token Chrome keeps because its rendering engine descends from WebKit.

Different browsers and devices send different strings. A mobile Safari browser on an iPhone sends a completely different user agent than Chrome on Windows, which helps servers deliver the right version of their content.

Why user agents matter for web scraping

Websites use user agent analysis as a primary defense against automated scraping. If your scraper doesn't send a user agent, sends a fake-looking one, or sends one that matches known scraping tools, the website will likely block your request or serve you a CAPTCHA.

Most web scraping libraries send their own default user agents that immediately identify them as bots. Python's requests library sends something like python-requests/2.28.1, which tells the server you're running an automated script. Websites block these instantly.
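You can see this for yourself by pointing requests at a header echo service (httpbin.org is used here purely as an illustrative test endpoint):

```python
import requests

# httpbin.org simply echoes back the headers it receives (used here only as a test endpoint)
response = requests.get("https://httpbin.org/headers")
print(response.json()["headers"]["User-Agent"])
# Prints something like: python-requests/2.28.1
```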

Using a proper user agent makes your scraper look like a regular browser visit. This dramatically increases your success rate because servers treat your requests as legitimate traffic instead of bot activity.

Beyond avoiding blocks, user agents also affect what content you receive. Some websites serve different versions of their pages based on the user agent. A mobile user agent might get a simplified mobile layout, while a desktop browser agent gets the full desktop version. If you're scraping product data that only appears in the desktop view, you need to send a desktop user agent.
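As a rough illustration, the sketch below fetches the same page with a desktop and a mobile user agent and compares the response sizes. The URL is a placeholder for whatever site you're targeting, and many sites will return the same content either way:

```python
import requests

url = "https://example.com"  # placeholder; substitute the site you're scraping

desktop_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
mobile_ua = ("Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) AppleWebKit/605.1.15 "
             "(KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1")

desktop_html = requests.get(url, headers={"User-Agent": desktop_ua}).text
mobile_html = requests.get(url, headers={"User-Agent": mobile_ua}).text

# A noticeable size difference is a quick hint that the site serves different layouts per device
print(len(desktop_html), len(mobile_html))
```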

Common user agent examples

The most common user agents represent popular browsers on widely used operating systems. These are the safest choices for web scraping because they blend in with normal traffic:

Chrome on Windows: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Chrome on macOS: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Safari on macOS: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15

Firefox on Windows: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0

Edge on Windows: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0

You can find your own browser's user agent by opening Developer Tools (press F12), going to the Network tab, reloading the page, and checking the User-Agent header in any request.

How to set user agents in web scraping

Setting a user agent is straightforward because it's just another HTTP header. You add it to your request headers when making HTTP calls.
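Here's a minimal sketch using Python's requests library; the same idea works in any HTTP client, and example.com is a placeholder URL:

```python
import requests

headers = {
    # A current Chrome-on-Windows user agent; update the version as new releases ship
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    )
}

# example.com is a placeholder; point this at the page you want to scrape
response = requests.get("https://example.com", headers=headers)
print(response.status_code)
```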

The key is choosing a realistic, current user agent that matches a popular browser. Avoid outdated browser versions or unusual combinations that might trigger anti-bot systems. Pick user agents from browsers that still have significant market share and update them periodically as new browser versions release.

User agent rotation

For large-scale scraping, using a single user agent isn't enough. Websites track patterns, and thousands of requests from the same user agent look suspicious. User agent rotation solves this by varying the user agent string across your requests, making your traffic appear to come from multiple different browsers instead of one automated source.

The simplest approach is random rotation. You maintain a list of realistic user agents and randomly select one for each request. This spreads your scraping activity across different browser profiles.
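A minimal sketch of random rotation, reusing a few of the strings listed earlier (the fetch helper is purely illustrative):

```python
import random
import requests

# A small pool of realistic, current user agents (keep this list up to date)
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

def fetch(url):
    # Pick a different user agent at random for each request
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers)
```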

A more sophisticated approach uses weighted randomization based on actual browser market share. If Chrome has 65% market share and Firefox has 15%, you'd select Chrome user agents 65% of the time and Firefox 15% of the time. This makes your traffic patterns match real user demographics.
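In Python, random.choices accepts per-item weights, so a weighted picker can be sketched like this; the percentages are illustrative and should track current market-share data:

```python
import random

# Illustrative weights loosely modeled on market share; adjust to current figures
WEIGHTED_USER_AGENTS = {
    # Chrome on Windows
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36": 65,
    # Safari on macOS
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15": 20,
    # Firefox on Windows
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0": 15,
}

def pick_user_agent():
    # random.choices draws proportionally to the supplied weights
    agents = list(WEIGHTED_USER_AGENTS)
    weights = list(WEIGHTED_USER_AGENTS.values())
    return random.choices(agents, weights=weights, k=1)[0]
```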

When rotating user agents, make sure they're all realistic and current. Mixing in outdated browser versions or obscure clients defeats the purpose because those still stand out as suspicious.

User agent best practices

Always use user agents from real, current browsers. Websites flag requests with outdated or non-standard user agents as suspicious. Chrome 95 looked normal in 2021 but stands out as outdated today.

Match your user agent to your target audience. If you're scraping a mobile-focused site, include mobile user agents in your rotation. For desktop-focused sites, stick with desktop browsers. This keeps your traffic patterns aligned with what the site normally sees.

User agents work best when combined with other anti-detection techniques. Pair them with proxy rotation to vary your IP addresses, use realistic request timing to avoid sending thousands of requests per second, and handle cookies properly to maintain session consistency.
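As a rough sketch of how these pieces fit together, a request loop might pair one user agent per session with cookie handling and randomized pauses; the URLs, delay range, and user agent pool below are all illustrative:

```python
import random
import time
import requests

# Illustrative pool; a real browser keeps one user agent for a whole session,
# so pick one per session rather than per request
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

urls = ["https://example.com/page/1", "https://example.com/page/2"]  # placeholder URLs

session = requests.Session()  # a Session keeps cookies consistent across requests
session.headers["User-Agent"] = random.choice(USER_AGENTS)

for url in urls:
    response = session.get(url)
    # Pause a few seconds between requests instead of hammering the server
    time.sleep(random.uniform(2, 6))
```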

Keep your user agent pool updated. Browser versions change every few weeks. What looks normal today will look suspicious in six months. Review and update your user agent strings quarterly to reflect current browser releases.

Don't rely solely on user agents. They're one piece of the puzzle, but websites also check your IP address, request headers, browser fingerprints, and behavior patterns. A perfect user agent won't help if you're sending 100 requests per second from a single IP.

How Browse AI handles user agents

Setting up user agent rotation manually requires maintaining lists of current user agents, implementing rotation logic, and keeping everything updated as browsers release new versions. This adds complexity to every scraping project.

Browse AI handles user agents automatically. The platform sends realistic browser user agents with every request, so websites see your scrapers as legitimate browser traffic. You don't need to research current user agents, write rotation logic, or update strings as browsers evolve.

This automatic handling is part of Browse AI's broader approach to making scraping work reliably without technical overhead. The platform manages the technical details like user agents, JavaScript rendering, and session handling, letting you focus on what data you need instead of how to avoid getting blocked.
