Docker Browser Automation: Containerized Scaling Guide
How to run browser automation in Docker containers with proper resource allocation, shared memory, and process management.
Introduction
Docker containers are the standard deployment unit for browser automation at scale. They provide isolation, reproducible environments, and horizontal scaling. However, browsers are resource-intensive applications with specific needs around shared memory, display servers, and process management that demand careful container configuration.
Docker Configuration Essentials
Shared Memory
Chrome uses shared memory (/dev/shm) for inter-process communication. The default Docker shared memory size (64 MB) is too small and causes Chrome to crash:
```yaml
services:
  worker:
    image: your-automation-image
    shm_size: '2gb'  # Required for Chrome
```
Or with docker run:
```shell
docker run --shm-size=2g your-automation-image
```
Dockerfile
A minimal Dockerfile for browser automation with BotCloud:
```dockerfile
FROM node:20-slim

# Install only essential system dependencies
RUN apt-get update && apt-get install -y \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .

CMD ["node", "worker.js"]
```
Since BotCloud runs browsers in the cloud, you do not need Chrome or its system dependencies in your container. The container only needs Node.js and your automation code.
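Because `COPY . .` pulls in the entire build context, a `.dockerignore` keeps local artifacts out of the image and speeds up builds. The entries below are illustrative, not exhaustive:

```
node_modules
.git
data
*.log
.env
```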
Resource Limits
Set memory and CPU limits to prevent runaway containers:
```yaml
services:
  worker:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G
```
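Outside Compose, the same limits can be applied directly with `docker run` flags:

```
docker run --cpus=2 --memory=2g --shm-size=2g your-automation-image
```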
Docker Compose for Multi-Worker
```yaml
version: '3.8'

services:
  worker:
    build: .
    shm_size: '2gb'
    environment:
      - BOTCLOUD_API_KEY=${BOTCLOUD_API_KEY}
      - CONCURRENCY=5
    deploy:
      replicas: 4
      resources:
        limits:
          cpus: '2'
          memory: 2G
    restart: unless-stopped
    volumes:
      - ./data:/app/data
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
```
Worker Pattern
A production worker that processes tasks from a queue:
```javascript
const puppeteer = require('puppeteer-core');

const CONCURRENCY = parseInt(process.env.CONCURRENCY || '5', 10);
const API_KEY = process.env.BOTCLOUD_API_KEY;

async function processTask(task) {
  const browser = await puppeteer.connect({
    browserWSEndpoint:
      `wss://bots.win/ws?apiKey=${API_KEY}&proxy=${encodeURIComponent(task.proxy)}`,
  });
  try {
    const page = await browser.newPage();
    await page.goto(task.url, { timeout: 30000 });
    const result = await page.evaluate(task.extractScript);
    return { taskId: task.id, status: 'success', data: result };
  } catch (error) {
    return { taskId: task.id, status: 'error', error: error.message };
  } finally {
    await browser.close();
  }
}

let shuttingDown = false;

// fetchTasks(n) and reportResults(results) are your queue client's API,
// defined elsewhere in your codebase.
async function worker() {
  while (!shuttingDown) {
    const tasks = await fetchTasks(CONCURRENCY);
    if (tasks.length === 0) {
      await new Promise(r => setTimeout(r, 5000));
      continue;
    }
    const results = await Promise.allSettled(
      tasks.map(task => processTask(task))
    );
    await reportResults(results);
  }
}

// Graceful shutdown: stop pulling new tasks; the loop exits once the
// current batch has completed and been reported.
process.on('SIGTERM', () => {
  console.log('Received SIGTERM, finishing current tasks...');
  shuttingDown = true;
});

worker().catch(console.error);
```
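The loop above calls `fetchTasks` and `reportResults` without defining them. A hypothetical in-memory stand-in, useful for exercising the worker locally (a real deployment would back these with Redis, SQS, or a similar queue), could look like:

```javascript
// Hypothetical in-memory queue standing in for a real task queue.
// Task shape matches what processTask expects: id, url, proxy, extractScript.
const queue = [
  { id: 1, url: 'https://example.com', proxy: '', extractScript: 'document.title' },
  { id: 2, url: 'https://example.org', proxy: '', extractScript: 'document.title' },
];

// Pop up to `n` tasks off the front of the queue.
async function fetchTasks(n) {
  return queue.splice(0, n);
}

// Collect results; a real implementation would POST them back to the queue service.
const reported = [];
async function reportResults(results) {
  reported.push(...results);
}
```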
Health Checks
Add health checks to detect stuck containers:
```yaml
services:
  worker:
    healthcheck:
      test: ["CMD", "node", "healthcheck.js"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
```
```javascript
// healthcheck.js
const http = require('http');

http.get('http://localhost:3001/health', (res) => {
  process.exit(res.statusCode === 200 ? 0 : 1);
}).on('error', () => process.exit(1));
```
Scaling Strategies
Horizontal Scaling
Increase the number of container replicas. Total browser concurrency equals replicas × CONCURRENCY, so 10 replicas with CONCURRENCY=5 gives 50 concurrent sessions:

```shell
docker compose up --scale worker=10
```
Kubernetes
For larger deployments, use Kubernetes:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: automation-worker
spec:
  replicas: 10
  selector:
    matchLabels:
      app: automation-worker
  template:
    metadata:
      labels:
        app: automation-worker
    spec:
      containers:
        - name: worker
          image: your-automation-image
          resources:
            limits:
              memory: "2Gi"
              cpu: "2000m"
          env:
            - name: BOTCLOUD_API_KEY
              valueFrom:
                secretKeyRef:
                  name: botcloud-secrets
                  key: api-key
```
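Kubernetes replaces Docker health checks with probes. A liveness probe against the same `/health` endpoint used by healthcheck.js (port 3001 assumed) would slot into the container spec above:

```yaml
          livenessProbe:
            httpGet:
              path: /health
              port: 3001
            initialDelaySeconds: 10
            periodSeconds: 30
            failureThreshold: 3
```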
Best Practices
- Always set `--shm-size=2g` even when using cloud browsers (Node.js may need it for WebSocket buffers)
- Use the `--init` flag or tini to handle zombie processes
- Set resource limits to prevent one container from starving others
- Implement health checks to detect and restart stuck workers
- Log to stdout/stderr for Docker's log aggregation
- Handle SIGTERM for graceful shutdown during rolling deployments
- Use secrets management for API keys, not environment variables in compose files
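The last point can be sketched with Compose file-based secrets (names illustrative):

```yaml
services:
  worker:
    secrets:
      - botcloud_api_key

secrets:
  botcloud_api_key:
    file: ./secrets/botcloud_api_key.txt
```

The worker then reads the key from `/run/secrets/botcloud_api_key` at startup instead of `process.env.BOTCLOUD_API_KEY`, keeping it out of `docker inspect` output and compose files checked into version control.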