Scaling Browser Automation: From 10 to 10,000 Sessions
Practical guide to scaling browser automation in the cloud, covering concurrency management, resource optimization, and architecture patterns.
Introduction
A single browser automation session is straightforward. Scaling to hundreds or thousands of concurrent sessions introduces challenges in resource management, error handling, session lifecycle, and cost optimization. This guide covers the practical patterns for scaling browser automation with BotCloud.
Understanding Resource Consumption
Each browser session consumes resources on both sides of the connection:
Cloud Side (Managed by BotCloud)
- RAM: 200-500 MB per session depending on page complexity
- CPU: Variable, spikes during page loads and JavaScript execution
- Network: Bandwidth for page loads, resource fetches, and proxy traffic
Client Side (Your Infrastructure)
- WebSocket connections: One persistent connection per session
- Memory: Minimal, just the WebSocket client and your script logic
- CPU: Negligible compared to running browsers locally
This asymmetry is the fundamental advantage of cloud browsers: the heavy resource consumption happens on managed infrastructure, while your servers stay lightweight.
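As a back-of-envelope check on the cloud-side figures above, you can size a fleet from the per-session RAM range. The helper and the 64 GB node size below are hypothetical, for illustration only:

```javascript
// Rough cloud-side capacity estimate from the per-session RAM figures.
// Sizing for the top of the 200-500 MB range keeps bursts of heavy
// pages from exhausting memory.
function estimateMaxSessions(totalRamMb, perSessionMb = 500) {
  return Math.floor(totalRamMb / perSessionMb);
}

// e.g. a hypothetical 64 GB node, sized at the worst case:
const maxSessions = estimateMaxSessions(64 * 1024);
console.log(maxSessions); // 131
```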
Concurrency Patterns
Worker Pool
The most common pattern is a fixed-size worker pool that processes tasks from a queue:
```javascript
const puppeteer = require('puppeteer-core');

const CONCURRENCY = 50;
const tasks = [...]; // Your task queue

async function processTask(task) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: `wss://bots.win/ws?apiKey=${API_KEY}`,
  });
  try {
    const page = await browser.newPage();
    await page.goto(task.url);
    // ... your automation logic
    return result;
  } finally {
    await browser.close();
  }
}

// Process tasks with bounded concurrency
async function runPool() {
  const active = new Set();
  for (const task of tasks) {
    if (active.size >= CONCURRENCY) {
      await Promise.race(active); // wait for one slot to free up
    }
    const promise = processTask(task)
      .catch((err) => console.error(`task failed: ${err.message}`))
      .finally(() => active.delete(promise));
    active.add(promise);
  }
  await Promise.all(active);
}
```
Dynamic Scaling
For workloads with variable demand, scale concurrency based on queue depth:
```javascript
function getConcurrency(queueDepth) {
  if (queueDepth > 1000) return 200;
  if (queueDepth > 100) return 50;
  if (queueDepth > 10) return 20;
  return 5;
}
```
Error Handling at Scale
At scale, errors are not exceptional; they are expected. Network timeouts, page crashes, and proxy failures will happen. Your architecture must handle them gracefully.
Retry with Backoff
```javascript
async function processWithRetry(task, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await processTask(task);
    } catch (error) {
      if (attempt === maxRetries - 1) throw error;
      // Exponential backoff, capped at 10 seconds
      const delay = Math.min(1000 * Math.pow(2, attempt), 10000);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```
Circuit Breaker
If the error rate exceeds a threshold, pause processing to avoid wasting resources:
```javascript
class CircuitBreaker {
  constructor(threshold = 0.5, window = 60000) {
    this.errors = [];
    this.threshold = threshold;
    this.window = window;
  }

  // Record one task outcome and drop events older than the window
  record(success) {
    this.errors.push({ time: Date.now(), success });
    this.errors = this.errors.filter(e => Date.now() - e.time < this.window);
  }

  // Only open once there is a meaningful sample (10+ recent events)
  isOpen() {
    if (this.errors.length < 10) return false;
    const errorRate = this.errors.filter(e => !e.success).length / this.errors.length;
    return errorRate > this.threshold;
  }
}
```
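Wiring the breaker into the worker loop might look like the sketch below. The task processor and the 30-second pause are passed in as parameters here purely for illustration; the only assumed interface is the `record`/`isOpen` pair from the class above.

```javascript
// Sketch: gate each task on the breaker before spending a session.
// `processTask` and `pauseMs` are injected so the policy stays pluggable.
async function guardedProcess(breaker, task, processTask, pauseMs = 30000) {
  if (breaker.isOpen()) {
    // Back off instead of burning sessions while the error rate is high;
    // the caller should requeue the task.
    await new Promise(r => setTimeout(r, pauseMs));
    return null;
  }
  try {
    const result = await processTask(task);
    breaker.record(true);
    return result;
  } catch (err) {
    breaker.record(false);
    throw err;
  }
}
```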
Cost Optimization
Session Lifecycle
The most impactful optimization is minimizing session duration. Each second a browser is open costs resources. Close sessions as soon as work is complete.
```javascript
// Bad: keeping session open between tasks
const browser = await puppeteer.connect({ ... });
for (const url of urls) {
  const page = await browser.newPage();
  await page.goto(url);
  await page.close();
}
await browser.close();

// Better: one session per task, close immediately
for (const url of urls) {
  const browser = await puppeteer.connect({ ... });
  const page = await browser.newPage();
  await page.goto(url);
  await browser.close(); // Release resources immediately
}
```
Resource Blocking
Block unnecessary resources to speed up page loads and reduce bandwidth:
```javascript
await page.setRequestInterception(true);
page.on('request', (req) => {
  const type = req.resourceType();
  if (['image', 'media', 'font', 'stylesheet'].includes(type)) {
    req.abort();
  } else {
    req.continue();
  }
});
```
Monitoring
At scale, visibility is essential. Track these metrics:
| Metric | Why It Matters |
|---|---|
| Active sessions | Capacity utilization |
| Session duration | Cost per task |
| Error rate | System health |
| Queue depth | Scaling trigger |
| P95 latency | User experience |
| Tasks per minute | Throughput |
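Several of these metrics can be derived from a single per-session record stream. The tracker below is a minimal in-process sketch, not a BotCloud API; field names are illustrative:

```javascript
// Minimal in-process tracker for session duration and error rate.
class SessionMetrics {
  constructor() {
    this.records = [];
  }

  record({ durationMs, success }) {
    this.records.push({ durationMs, success, time: Date.now() });
  }

  // P95 session duration: sort durations and take the 95th-percentile rank
  p95DurationMs() {
    const sorted = this.records.map(r => r.durationMs).sort((a, b) => a - b);
    if (sorted.length === 0) return 0;
    const idx = Math.min(sorted.length - 1, Math.floor(sorted.length * 0.95));
    return sorted[idx];
  }

  errorRate() {
    if (this.records.length === 0) return 0;
    return this.records.filter(r => !r.success).length / this.records.length;
  }
}
```

In production you would ship these records to a metrics backend rather than keeping them in memory, but the derived quantities are the same.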
Architecture Recommendations
- Start with 10-20 concurrent sessions and increase gradually
- Use a task queue (Redis, SQS, RabbitMQ) to decouple producers from consumers
- Implement circuit breakers to prevent cascade failures
- Set hard timeouts on every session (e.g., 60 seconds max)
- Log structured data for every session: task ID, duration, success/failure, error details
- Monitor costs per task to identify optimization opportunities
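The hard-timeout recommendation above can be sketched with a simple race between the task and a timer; the 60-second figure is the article's suggestion, and the helper name is illustrative:

```javascript
// Sketch: enforce a hard cap on any session's lifetime by racing the
// task promise against a timer. The timer is always cleared so it
// cannot keep the process alive after the task settles.
async function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timeout after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

A caller would wrap each task, e.g. `withTimeout(processTask(task), 60000)`, and still close the browser in a `finally` block so a timed-out session releases its resources.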