What is a status code?
A status code is a three-digit number that tells you whether your web request succeeded or failed. When your scraper sends a request to a website, the server responds with a status code that acts like a traffic signal. It determines whether you got the data you wanted, need to try again, or should stop and adjust your approach.
Status codes are split into five families, each with a different meaning. The first digit tells you the category: 1xx means the server received your request and is still processing it, 2xx means success, 3xx means the content moved somewhere else, 4xx means your request had a problem, and 5xx means the server had an issue.
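For example, in Python with the requests library, integer division by 100 gives you the family. Here's a minimal sketch; the URL is just a placeholder:

```python
import requests

response = requests.get("https://example.com/products")

family = response.status_code // 100  # 2 for 2xx, 4 for 4xx, and so on
if family == 2:
    print("Success:", response.status_code)
elif family == 4:
    print("Client-side problem:", response.status_code)
elif family == 5:
    print("Server-side problem:", response.status_code)
```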
Why status codes matter in web scraping
Status codes directly affect your scraping budget, data quality, and bot detection risk. Most scraping services only charge you for successful 200 responses, so understanding status codes helps you avoid wasting credits on failed requests.
When you ignore status codes, you'll end up parsing error pages as data, hitting rate limits that get your IPs banned, and retrying requests that will never work. Each status code requires a different response strategy. A 404 means the page doesn't exist and retrying won't help. A 429 means you're going too fast and need to slow down or rotate proxies. A 503 might resolve in a few minutes, making it worth a retry.
Common status codes you'll encounter
200 OK
This is what you want to see. The server processed your request and sent back the content you asked for. When you get a 200, you can parse the response and extract your data. However, some websites return 200 status codes even when they're blocking you, so always validate that the response contains actual content and not a ban page or CAPTCHA.
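Here's a minimal sketch of that validation with requests. The block-page phrases are placeholder heuristics; tune them to whatever your target site actually shows when it blocks you:

```python
import requests

# Phrases that often appear on block pages; adjust for your target site.
BLOCK_MARKERS = ["captcha", "access denied", "unusual traffic"]

def looks_like_real_content(response: requests.Response) -> bool:
    """Treat a 200 as success only if the body doesn't resemble a ban page."""
    if response.status_code != 200:
        return False
    body = response.text.lower()
    return not any(marker in body for marker in BLOCK_MARKERS)

response = requests.get("https://example.com/products")
if looks_like_real_content(response):
    print("Safe to parse")
else:
    print("Got a 200, but the body looks like a block page")
```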
404 Not Found
The page doesn't exist on the server. This could mean the URL is wrong, the page was deleted, or your scraper is building URLs incorrectly. You shouldn't retry 404 errors since the page genuinely doesn't exist. In some cases, 404 responses are valuable data points that tell you when products are discontinued or content is removed.
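A small sketch of that approach with requests, where the hypothetical check_product helper treats a 404 as a "page removed" signal instead of retrying:

```python
import requests

def check_product(url: str) -> str | None:
    """Return page HTML, or None if the product page no longer exists."""
    response = requests.get(url)
    if response.status_code == 404:
        # Don't retry: record the disappearance as a data point instead.
        print(f"Product gone: {url}")
        return None
    response.raise_for_status()  # surface any other error
    return response.text
```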
429 Too Many Requests
The server is rate limiting you because you're sending too many requests too quickly. This is one of the most critical codes for scrapers. When you hit a 429, slow down your request rate, add delays between requests, or rotate to different IP addresses. Some servers include a Retry-After header that tells you exactly how long to wait before trying again.
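Here's one way to honor that header with requests. Note that Retry-After can also be an HTTP date; this sketch handles only the numeric-seconds form and falls back to exponential backoff:

```python
import time
import requests

def get_with_rate_limit(url: str, max_attempts: int = 3) -> requests.Response:
    """Retry on 429, honoring Retry-After when the server sends it in seconds."""
    for attempt in range(max_attempts):
        response = requests.get(url)
        if response.status_code != 429:
            return response
        retry_after = response.headers.get("Retry-After", "")
        # Fall back to doubling waits when the header is missing or is a date.
        wait = int(retry_after) if retry_after.isdigit() else 2 ** attempt
        time.sleep(wait)
    return response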
403 Forbidden
The server understood your request but refuses to fulfill it. This often means the website detected your scraper and blocked you. You might have missing or suspicious headers, be using a datacenter IP address, or your request pattern looks like a bot. Try rotating user agents, using residential proxies, or adding cookies to make your requests look more legitimate.
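A minimal sketch of user agent rotation with requests. The user agent strings and header values below are illustrative; keep your pool in sync with real, current browser releases:

```python
import random
import requests

# A small pool of browser user agents, rotated per request.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]

headers = {
    "User-Agent": random.choice(USER_AGENTS),
    "Accept-Language": "en-US,en;q=0.9",
}
response = requests.get("https://example.com/products", headers=headers)
print(response.status_code)
```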
500 Internal Server Error
Something broke on the server side. This isn't your fault, and the issue is usually temporary. Wait a few seconds and retry the request. If 500 errors persist, the website might be having infrastructure problems and you should try again later.
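A simple fixed-delay retry with requests is usually enough for occasional 500s; the retry count and delay below are arbitrary starting points:

```python
import time
import requests

def get_with_simple_retry(url: str, retries: int = 3, delay: float = 5.0) -> requests.Response:
    """Retry transient 5xx errors a few times with a short fixed delay."""
    for attempt in range(retries):
        response = requests.get(url)
        if response.status_code < 500:
            return response
        time.sleep(delay)
    return response  # still failing after all retries; let the caller decide
```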
503 Service Unavailable
The server is temporarily overloaded or down for maintenance. If you get clusters of 503 errors at specific times, schedule your scraping jobs during off-peak hours. Implement retry logic with exponential backoff and random jitter to avoid hammering an already struggling server.
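Here's a sketch of that backoff pattern with requests. The base delay doubles on each attempt, and the random jitter keeps a fleet of scrapers from retrying in lockstep:

```python
import random
import time
import requests

def get_with_backoff(url: str, max_attempts: int = 5) -> requests.Response:
    """Retry 503s with exponential backoff plus random jitter."""
    for attempt in range(max_attempts):
        response = requests.get(url)
        if response.status_code != 503:
            return response
        # 1s, 2s, 4s, 8s... plus up to a second of jitter to spread out retries.
        time.sleep(2 ** attempt + random.random())
    return response
```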
301 and 302 Redirects
The content moved to a different URL. A 301 means the move is permanent, while a 302 is temporary. Most scraping libraries automatically follow redirects, but you should watch for redirect loops and verify that redirects don't lead you off the target domain.
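With requests, redirects are followed by default (and a redirect loop raises TooManyRedirects after 30 hops), so the useful part is inspecting where you ended up. The domain check below assumes example.com is your target:

```python
import requests
from urllib.parse import urlparse

response = requests.get("https://example.com/old-page")  # follows redirects by default

# response.history holds every redirect hop in order.
for hop in response.history:
    print(hop.status_code, "->", hop.headers.get("Location"))

# Flag redirects that lead off the target domain.
if urlparse(response.url).netloc != "example.com":
    print("Warning: redirected off-domain to", response.url)
```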
How to handle status codes in your scraper
Build different response strategies based on status code families. For 2xx codes, parse and extract the data. For 3xx codes, follow the redirect but watch for infinite loops. For 4xx codes, log the error and move on since retrying won't help (except for 429, which needs rate limiting). For 5xx codes, implement exponential backoff and retry a few times before giving up.
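Putting those rules together, a dispatch function might look like this sketch; the action names are placeholders for whatever your pipeline actually does:

```python
import requests

def handle(response: requests.Response) -> str:
    """Map a response to an action based on its status code family."""
    code = response.status_code
    if 200 <= code < 300:
        return "parse"            # extract the data
    if 300 <= code < 400:
        return "follow-redirect"  # requests usually does this for you
    if code == 429:
        return "slow-down"        # back off before retrying
    if 400 <= code < 500:
        return "skip"             # retrying won't help
    return "retry-with-backoff"   # 5xx: transient, worth a few retries
```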
Set up logging that captures the URL, status code, timestamp, and any retry attempts. This helps you spot patterns like certain pages always returning 404s or specific time windows producing 503 errors. Monitor your status code distribution to catch problems early. If your 4xx error rate suddenly spikes, you might have a detection problem that needs fixing.
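Python's standard logging module covers this with a one-line record per request; the file name and format below are just one reasonable setup:

```python
import logging

logging.basicConfig(
    filename="scraper.log",
    format="%(asctime)s %(levelname)s %(message)s",  # timestamp comes for free
    level=logging.INFO,
)

def log_request(url: str, status_code: int, attempt: int) -> None:
    """Record one line per request so patterns show up in the log."""
    logging.info("url=%s status=%d attempt=%d", url, status_code, attempt)
```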
Create alert thresholds based on error rates. If more than 5% of requests return 4xx errors, pause the scraper and investigate. If 5xx errors exceed 10%, the target site might be having infrastructure issues and you should reschedule your job.
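A sketch of those threshold checks over a batch of recorded status codes; in a real scraper you'd trigger a pause or reschedule instead of printing:

```python
from collections import Counter

def check_error_rates(status_codes: list[int]) -> None:
    """Flag the run when 4xx or 5xx rates cross the alert thresholds."""
    total = len(status_codes)
    if total == 0:
        return
    counts = Counter(code // 100 for code in status_codes)
    if counts[4] / total > 0.05:
        print("4xx rate above 5%: pause and investigate detection issues")
    if counts[5] / total > 0.10:
        print("5xx rate above 10%: target may be down, reschedule the job")
```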
How Browse AI handles status codes
When you use Browse AI for web scraping, the platform automatically handles status codes for you. If a page returns a 429 or 503, Browse AI implements retry logic with smart backoff strategies. The platform rotates through different browser fingerprints and IP addresses to avoid detection, which helps prevent 403 errors. You get clean, structured data without worrying about parsing error pages or managing retry logic yourself. Browse AI's monitoring dashboard shows you which requests succeeded and which failed, so you can identify problematic pages without sifting through raw status codes.

