Scaling Browser Automation: From 10 to 10,000 Sessions
Practical guide to scaling browser automation in the cloud, covering concurrency management, resource optimization, and architecture patterns.
Introduction
A single browser automation session is straightforward. Scaling to hundreds or thousands of concurrent sessions introduces challenges in resource management, error handling, session lifecycle, and cost optimization. This guide covers the practical patterns for scaling browser automation with BotCloud.
Understanding Resource Consumption
Each browser session consumes resources on both sides of the connection:
Cloud Side (Managed by BotCloud)
- RAM: 200-500 MB per session depending on page complexity
- CPU: Variable, spikes during page loads and JavaScript execution
- Network: Bandwidth for page loads, resource fetches, and proxy traffic
Client Side (Your Infrastructure)
- WebSocket connections: One persistent connection per session
- Memory: Minimal, just the WebSocket client and your script logic
- CPU: Negligible compared to running browsers locally
This asymmetry is the fundamental advantage of cloud browsers: the heavy resource consumption happens on managed infrastructure, while your servers stay lightweight.
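As a back-of-envelope check on the cloud-side figures above, you can size a fleet from the per-session RAM range. The helper and the 64 GB node size below are hypothetical, for illustration only:

```javascript
// Rough cloud-side capacity estimate from the per-session RAM figures.
// Sizing for the top of the 200-500 MB range keeps bursts of heavy
// pages from exhausting memory.
function estimateMaxSessions(totalRamMb, perSessionMb = 500) {
  return Math.floor(totalRamMb / perSessionMb);
}

// e.g. a hypothetical 64 GB node, sized at the worst case:
const maxSessions = estimateMaxSessions(64 * 1024);
console.log(maxSessions); // 131
```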
Concurrency Patterns
Worker Pool
The most common pattern is a fixed-size worker pool that processes tasks from a queue:
```javascript
const puppeteer = require('puppeteer-core');

const CONCURRENCY = 50;
const tasks = [...]; // Your task queue

async function processTask(task) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: `wss://bots.win/ws?apiKey=${API_KEY}`,
  });
  try {
    const page = await browser.newPage();
    await page.goto(task.url);
    // ... your automation logic
    return result;
  } finally {
    await browser.close();
  }
}

// Process tasks with bounded concurrency
async function runPool() {
  const active = new Set();
  for (const task of tasks) {
    if (active.size >= CONCURRENCY) {
      await Promise.race(active); // wait for one slot to free up
    }
    const promise = processTask(task)
      .catch((err) => console.error(`task failed: ${err.message}`))
      .finally(() => active.delete(promise));
    active.add(promise);
  }
  await Promise.all(active);
}
```
Dynamic Scaling
For workloads with variable demand, scale concurrency based on queue depth:
```javascript
function getConcurrency(queueDepth) {
  if (queueDepth > 1000) return 200;
  if (queueDepth > 100) return 50;
  if (queueDepth > 10) return 20;
  return 5;
}
```
Error Handling at Scale
At scale, errors are not exceptional; they are expected. Network timeouts, page crashes, and proxy failures will happen. Your architecture must handle them gracefully.
Retry with Backoff
```javascript
async function processWithRetry(task, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await processTask(task);
    } catch (error) {
      if (attempt === maxRetries - 1) throw error;
      // Exponential backoff, capped at 10 seconds
      const delay = Math.min(1000 * Math.pow(2, attempt), 10000);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```
Circuit Breaker
If the error rate exceeds a threshold, pause processing to avoid wasting resources:
```javascript
class CircuitBreaker {
  constructor(threshold = 0.5, window = 60000) {
    this.errors = [];
    this.threshold = threshold;
    this.window = window;
  }

  // Record one task outcome and drop events older than the window
  record(success) {
    this.errors.push({ time: Date.now(), success });
    this.errors = this.errors.filter(e => Date.now() - e.time < this.window);
  }

  // Only open once there is a meaningful sample (10+ recent events)
  isOpen() {
    if (this.errors.length < 10) return false;
    const errorRate = this.errors.filter(e => !e.success).length / this.errors.length;
    return errorRate > this.threshold;
  }
}
```
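Wiring the breaker into the worker loop might look like the sketch below. The task processor and the 30-second pause are passed in as parameters here purely for illustration; the only assumed interface is the `record`/`isOpen` pair from the class above.

```javascript
// Sketch: gate each task on the breaker before spending a session.
// `processTask` and `pauseMs` are injected so the policy stays pluggable.
async function guardedProcess(breaker, task, processTask, pauseMs = 30000) {
  if (breaker.isOpen()) {
    // Back off instead of burning sessions while the error rate is high;
    // the caller should requeue the task.
    await new Promise(r => setTimeout(r, pauseMs));
    return null;
  }
  try {
    const result = await processTask(task);
    breaker.record(true);
    return result;
  } catch (err) {
    breaker.record(false);
    throw err;
  }
}
```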
Cost Optimization
Session Lifecycle
The most impactful optimization is minimizing session duration. Each second a browser is open costs resources. Close sessions as soon as work is complete.
```javascript
// Bad: keeping session open between tasks
const browser = await puppeteer.connect({ ... });
for (const url of urls) {
  const page = await browser.newPage();
  await page.goto(url);
  await page.close();
}
await browser.close();

// Better: one session per task, close immediately
for (const url of urls) {
  const browser = await puppeteer.connect({ ... });
  const page = await browser.newPage();
  await page.goto(url);
  await browser.close(); // Release resources immediately
}
```
Resource Blocking
Block unnecessary resources to speed up page loads and reduce bandwidth:
```javascript
await page.setRequestInterception(true);
page.on('request', (req) => {
  const type = req.resourceType();
  if (['image', 'media', 'font', 'stylesheet'].includes(type)) {
    req.abort();
  } else {
    req.continue();
  }
});
```
Monitoring
At scale, visibility is essential. Track these metrics:
| Metric | Why It Matters |
|---|---|
| Active sessions | Capacity utilization |
| Session duration | Cost per task |
| Error rate | System health |
| Queue depth | Scaling trigger |
| P95 latency | User experience |
| Tasks per minute | Throughput |
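Several of these metrics can be derived from a single per-session record stream. The tracker below is a minimal in-process sketch, not a BotCloud API; field names are illustrative:

```javascript
// Minimal in-process tracker for session duration and error rate.
class SessionMetrics {
  constructor() {
    this.records = [];
  }

  record({ durationMs, success }) {
    this.records.push({ durationMs, success, time: Date.now() });
  }

  // P95 session duration: sort durations and take the 95th-percentile rank
  p95DurationMs() {
    const sorted = this.records.map(r => r.durationMs).sort((a, b) => a - b);
    if (sorted.length === 0) return 0;
    const idx = Math.min(sorted.length - 1, Math.floor(sorted.length * 0.95));
    return sorted[idx];
  }

  errorRate() {
    if (this.records.length === 0) return 0;
    return this.records.filter(r => !r.success).length / this.records.length;
  }
}
```

In production you would ship these records to a metrics backend rather than keeping them in memory, but the derived quantities are the same.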
Architecture Recommendations
- Start with 10-20 concurrent sessions and increase gradually
- Use a task queue (Redis, SQS, RabbitMQ) to decouple producers from consumers
- Implement circuit breakers to prevent cascade failures
- Set hard timeouts on every session (e.g., 60 seconds max)
- Log structured data for every session: task ID, duration, success/failure, error details
- Monitor costs per task to identify optimization opportunities
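The hard-timeout recommendation above can be sketched with a simple race between the task and a timer; the 60-second figure is the article's suggestion, and the helper name is illustrative:

```javascript
// Sketch: enforce a hard cap on any session's lifetime by racing the
// task promise against a timer. The timer is always cleared so it
// cannot keep the process alive after the task settles.
async function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timeout after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

A caller would wrap each task, e.g. `withTimeout(processTask(task), 60000)`, and still close the browser in a `finally` block so a timed-out session releases its resources.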