A token is a small piece of data, usually a short string, that proves who you are or what you're allowed to do when interacting with websites and APIs. Instead of sending your username and password with every request, you send the token. The server recognizes it and treats your scraper like a logged-in user or an approved application.
How tokens work
Here's the basic flow: You authenticate once (through a login form, API key registration, or OAuth flow), and the server generates a token for you. Your scraper then includes this token in headers, cookies, or the request body for all subsequent requests. The server validates the token and grants access until it expires or gets revoked.
Tokens come in two flavors. Stateless tokens (like JWTs) contain all the information the server needs embedded right in the token itself. Stateful tokens are just IDs that map to data stored on the server, like traditional session IDs.
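To see what "embedded right in the token" means, here is a minimal sketch that reads the claims out of a JWT payload. It assumes a standard three-part JWT; the demo token and the helper names (decode_jwt_payload, _b64url) are made up for illustration, and the code deliberately does not verify the signature, so treat it as a way to inspect a token, not to trust one.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims in a JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")
    payload_b64 = parts[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(data: dict) -> str:
    """Encode a dict the way JWT segments are encoded (hypothetical helper)."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Build an unsigned demo token so the example runs offline.
demo_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"sub": "user-42", "exp": 1700000000}),
    "signature-placeholder",
])

claims = decode_jwt_payload(demo_token)
print(claims["sub"], claims["exp"])
```

A stateful session ID, by contrast, would decode to nothing useful: it is only a lookup key for data held on the server.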
Types of tokens you'll encounter
API tokens (API keys)
These identify and authorize your application when calling an API. You typically get them from a developer dashboard and include them in a header like Authorization: Bearer your_api_key, or in whatever header or query parameter the provider's documentation specifies. API keys are common when scraping data from official APIs instead of parsing HTML directly.
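As a rough sketch, attaching an API key as a Bearer header with Python's standard library might look like this. The URL and the build_api_request helper are placeholders, and the request is only constructed here, not sent.

```python
import urllib.request


def build_api_request(url: str, api_key: str) -> urllib.request.Request:
    """Attach the API key as a Bearer token (hypothetical helper)."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


# Placeholder endpoint and key; nothing is sent over the network here.
request = build_api_request("https://api.example.com/v1/items", "your_api_key")
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen would actually send it.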
Authentication tokens
These prove your identity after you log in. JSON Web Tokens (JWTs) are a popular example. When scraping authenticated content, you first POST credentials to the login endpoint, capture the returned token, then include it in your subsequent requests.
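The login-then-reuse flow could be sketched like this. Everything site-specific is an assumption: the endpoint URLs, the JSON body shape, and the "token" field name in the response all vary from site to site, so check the real login call in your browser's network tab. The example works on a canned response body so it runs without a network connection.

```python
import json
import urllib.request


def build_login_request(login_url: str, username: str, password: str) -> urllib.request.Request:
    """Build the credential POST (assumes the site accepts a JSON body)."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        login_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_body: bytes, field: str = "token") -> str:
    """Pull the token out of a JSON login response (field name is an assumption)."""
    token = json.loads(response_body).get(field)
    if not token:
        raise ValueError(f"no {field!r} field in login response")
    return token


def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a follow-up request that carries the captured token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


login_request = build_login_request("https://example.com/api/login", "alice", "s3cret")
# In a real scraper this body would come from urllib.request.urlopen(login_request).read().
sample_response = b'{"token": "eyJ.demo.token"}'
token = extract_token(sample_response)
next_request = authorized_request("https://example.com/api/orders", token)
print(login_request.get_method(), next_request.get_header("Authorization"))
```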
Session tokens
These represent a logged-in session on the server. They usually appear as cookies like sessionid=abc123 that get sent automatically with each request. When you use a session object in your scraper, it stores these cookies and makes every request look like it comes from the same logged-in browser.
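Under the hood, "storing these cookies" amounts to reading the Set-Cookie response header and echoing the name and value back in a Cookie request header. A minimal sketch with the standard library follows. The helper name is made up, and a real session object (such as requests.Session) also tracks domains, paths, and expiry for you.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value: str) -> str:
    """Turn a Set-Cookie response value into a Cookie request value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Attributes such as Path or HttpOnly stay on the client; only
    # name=value pairs are sent back to the server.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


cookie_header = cookie_header_from_set_cookie("sessionid=abc123; Path=/; HttpOnly")
print(cookie_header)
```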
Access tokens
Short-lived tokens that grant specific permissions to protected APIs, often used in OAuth 2.0 flows. They include scopes that define exactly what you can access (like read:orders) and have clear expiration times.
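One way to keep track of scopes and expiry in a scraper is a small value object built from the OAuth 2.0 token response. The sketch below assumes the standard response fields (access_token, expires_in, and a space-separated scope string); the AccessToken class and the 30-second leeway are illustrative choices, not part of any library.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessToken:
    value: str
    scopes: frozenset
    expires_at: float  # Unix timestamp

    def is_expired(self, now: Optional[float] = None, leeway: float = 30.0) -> bool:
        """Treat the token as expired slightly early to avoid mid-request expiry."""
        now = time.time() if now is None else now
        return now >= self.expires_at - leeway

    def allows(self, scope: str) -> bool:
        return scope in self.scopes


def from_token_response(payload: dict, issued_at: float) -> AccessToken:
    """Build an AccessToken from a parsed OAuth 2.0 token response."""
    return AccessToken(
        value=payload["access_token"],
        scopes=frozenset(payload.get("scope", "").split()),
        expires_at=issued_at + payload["expires_in"],
    )


access_token = from_token_response(
    {"access_token": "tok", "expires_in": 3600, "scope": "read:orders"},
    issued_at=1000.0,
)
print(access_token.allows("read:orders"), access_token.is_expired(now=1000.0))
```

Checking is_expired before each request lets the scraper refresh proactively instead of waiting for the server to reject it.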
CSRF tokens
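Hidden form fields or headers that protect against cross-site request forgery (CSRF). When scraping forms, you first GET the page, extract the CSRF token from the HTML, then include it in your POST request, usually along with the cookies from that same GET, since the server typically ties the token to your session.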
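The extract-then-post step can be sketched with the standard library's HTML parser. The field name csrf_token and the sample form are assumptions; real sites use names like csrfmiddlewaretoken or authenticity_token, so inspect the form you are targeting.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class CsrfFinder(HTMLParser):
    """Remember the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name: str):
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.field_name:
            self.token = attributes.get("value")


def extract_csrf_token(html: str, field_name: str = "csrf_token") -> str:
    finder = CsrfFinder(field_name)
    finder.feed(html)
    if finder.token is None:
        raise ValueError(f"no input named {field_name!r} found")
    return finder.token


# Stand-in for the HTML returned by the initial GET.
sample_page = """
<form method="post" action="/login">
  <input type="hidden" name="csrf_token" value="tok-123">
  <input type="text" name="username">
</form>
"""

csrf_token = extract_csrf_token(sample_page)
# Body for the follow-up POST: the token travels with the form data.
form_body = urlencode({"csrf_token": csrf_token, "username": "alice"})
print(form_body)
```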
Using tokens in web scraping
The most common pattern involves logging in and maintaining a session. You use an HTTP client that keeps cookies, perform a login POST, and the response sets a session cookie. All your subsequent requests using that same session see authenticated content.
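With only the standard library, a cookie-keeping client could be set up roughly as follows. The form field names (username, password) and any URL you pass in are assumptions about the target site; the login function is defined but not called here, because it needs a real endpoint.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return an opener that stores and replays cookies, plus its cookie jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, login_url: str, username: str, password: str) -> int:
    """POST form-encoded credentials; the session cookie lands in the jar."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    with opener.open(login_url, data=form.encode("utf-8"), timeout=10) as response:
        return response.status


session, cookie_jar = make_session()
# After login(session, ...), every session.open(url) call sends the stored
# cookies automatically, so later pages see the logged-in state.
print(len(cookie_jar))
```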
For API-based scraping, you inspect network traffic in your browser's developer tools to find calls using Bearer tokens. Once you understand how the token gets generated, you recreate that flow programmatically in your scraper.
Best practices for token management
Keep tokens secure by storing them in environment variables or a secrets manager, never in your code or version control. Use HTTPS to protect tokens in transit, and give tokens only the minimum permissions they need.
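A minimal sketch of reading a token from the environment instead of hard-coding it might look like this. The variable name SCRAPER_API_TOKEN and the load_token helper are made up for the example.

```python
import os


def load_token(var_name: str = "SCRAPER_API_TOKEN") -> str:
    """Read the token from the environment and fail loudly if it is missing."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable before running the scraper")
    return token
```

Failing at startup is a deliberate choice: a missing token surfaces immediately instead of as a confusing run of unauthorized responses later.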
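Build your scraper to handle token expiry gracefully. Watch for 401 responses, which usually signal a missing, expired, or invalid token. Some sites return 403 instead, although 403 more often means the token is valid but lacks permission for that resource. Implement automatic re-authentication with backoff to avoid getting locked out.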
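Re-authentication with backoff could be sketched as a small wrapper around your request function. The fetch and reauthenticate callables are placeholders you would supply yourself; fetch is assumed to return a (status, body) pair. Treating both 401 and 403 as "try a fresh token" is a simplification that may not suit every site.

```python
import time


def fetch_with_reauth(fetch, reauthenticate, token,
                      max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(token); on 401/403, back off, get a new token, and retry."""
    for attempt in range(max_attempts):
        status, body = fetch(token)
        if status not in (401, 403):
            return status, body
        if attempt == max_attempts - 1:
            break
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
        token = reauthenticate()
    raise PermissionError(f"still unauthorized after {max_attempts} attempts")


# Offline demo with stand-in functions: the first token is "expired".
def fake_fetch(token):
    return (200, "ok") if token == "fresh" else (401, "")


demo_result = fetch_with_reauth(fake_fetch, lambda: "fresh", "stale", sleep=lambda s: None)
print(demo_result)
```

Capping the number of attempts matters as much as the backoff: repeated failed logins are exactly what gets accounts locked.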
Avoid hammering a site with requests that share a single token across multiple IPs or threads. This behavior can trigger anti-abuse systems and get your token revoked.
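If several threads do share one token, a simple throttle can space their requests out. This is a sketch, not a full rate limiter: the TokenThrottle class is hypothetical, and the right interval depends on the site's published limits.

```python
import threading
import time


class TokenThrottle:
    """Enforce a minimum interval between requests made with one token."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        """Block until this caller's slot arrives; return the delay applied."""
        with self._lock:
            now = self._clock()
            delay = max(0.0, self._next_allowed - now)
            # Reserve the next slot while holding the lock so threads queue up.
            self._next_allowed = max(now, self._next_allowed) + self._min_interval
        if delay:
            self._sleep(delay)
        return delay


throttle = TokenThrottle(min_interval=0.5)
throttle.wait()  # call this before every request that uses the shared token
```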
How Browse AI handles authentication
Managing tokens and authenticated sessions adds complexity to web scraping projects. Browse AI simplifies this by letting you record scraping tasks while logged into websites. The platform handles session management, token refresh, and cookie persistence automatically. You focus on selecting the data you want, and Browse AI takes care of the authentication headaches behind the scenes.

