The ₹300 Headless Chrome Pool That Saved My Demos (and Why It Needs Babysitting)

How I replaced flaky screenshot jobs with a tiny headless‑Chrome pool on a ₹300 VPS, the one upgrade that broke everything, and the few rules I now live by.

Written by: Arjun Malhotra


It was 11:47 a.m. on a Tuesday and I was supposed to show screenshots of our checkout flow to the product team in a demo at noon. CI had mysteriously started timing out on the visual-tests job two days earlier. The runner logs said “Chrome crashed”, then “OOM killed”, then nothing. Someone was going to ask why staging looked different from production. I opened the console, felt my stomach drop, and started a manual browser session. Ten minutes later I still hadn’t finished capturing the three screen states I needed.

That afternoon I stopped babysitting other people’s machines and built something I could control: a tiny pool of headless Chrome instances on a ₹300/month VPS. It now handles our screenshots, PDF generation, and a few flaky visual tests. It saved me from a handful of last‑minute panics. It also taught me how fragile an apparently simple infra shortcut can be.

Why I needed a pool

Our problems were boring but real: CI runners are ephemeral, have small disks, and when dozens of jobs start Chrome at once they compete for RAM. We saw crashes and flakiness, and we longed for stability during demos. Running Chrome inside every test job also burned our CI minutes, a real cost for a small startup where every ₹ counts.

A central pool gives me:

How I built it (the fast, practical version)

I had three hard constraints: keep it cheap, make it resilient to slow office networks, and avoid bloated tools.

What I run: a small VPS (₹300/mo), Docker, a Redis queue, and N worker containers (N=3 by default). Each worker runs Chrome in headless mode inside a lightweight Docker image (Debian slim + Chrome) and exposes a JSON-over-HTTP RPC for jobs: navigate, waitForSelector, setViewport, captureScreenshot, then return the image as base64. A tiny Node.js dispatcher takes requests from Redis, hands each one to an available worker, and retries on failure.
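To make the dispatcher's retry behaviour concrete, here is a minimal sketch in TypeScript. It is not the actual dispatcher: the names `RenderJob`, `RenderStep`, `RenderWorker`, and `dispatchJob` are hypothetical, the Redis queue is left out, and a worker is modelled as a plain async function where the real thing would be an HTTP call to a worker container. It also rotates through workers in order instead of tracking which ones are free.

```typescript
// Hypothetical job shape, modelled on the RPC steps listed above.
type RenderStep =
  | { op: "navigate"; url: string }
  | { op: "waitForSelector"; selector: string }
  | { op: "setViewport"; width: number; height: number }
  | { op: "captureScreenshot" };

interface RenderJob {
  id: string;
  steps: RenderStep[];
}

// Assumption: a worker runs a job and resolves to a base64-encoded image.
// In a real pool this would wrap an HTTP request to a worker container.
type RenderWorker = (job: RenderJob) => Promise<string>;

// Try the job up to maxAttempts times, moving to the next worker after
// each failure. Throws if every attempt fails.
async function dispatchJob(
  job: RenderJob,
  workers: RenderWorker[],
  maxAttempts = 3,
): Promise<string> {
  if (workers.length === 0) {
    throw new Error("no workers registered");
  }
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const worker = workers[attempt % workers.length];
    try {
      return await worker(job);
    } catch (err) {
      lastError = err; // remember the failure, then try the next worker
    }
  }
  throw new Error(
    `job ${job.id} failed after ${maxAttempts} attempts: ${String(lastError)}`,
  );
}
```

Passing the workers in as functions keeps the retry logic testable without Chrome or Redis running, which matters when the whole point is to stop depending on a flaky environment.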

Key practical bits:

Why this is better than “just run Chrome in CI”

The week it broke me

I said earlier: pin the Chrome binary. I didn’t do that strictly enough. One morning an unattended apt upgrade on the VPS pulled a new Chrome. The next CI run produced screenshots whose font rendering and spacing differed slightly from the baselines. The visual tests exploded. Worse, the screenshots our dispatcher kept handing out looked “fine” to me locally, because my laptop Chrome was still the old version. The team spent a day chasing CSS kerning and blaming frontend engineers. I spent that day rolling back the VPS image and admitting I’d been sloppy.

That was my real failure: I cared about speed and simplicity more than reproducibility. The rollback cost me a full day of team goodwill and a new rule: immutable images + automated image builds. Now the VPS uses a pinned image built by our CI pipeline; the dispatcher refuses to accept worker registrations from untrusted Chrome versions.
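The registration check can be sketched roughly like this in TypeScript. Everything named here is an assumption for illustration: `PINNED_CHROME_VERSIONS`, `WorkerRegistration`, `parseChromeVersion`, and `acceptRegistration` are hypothetical, the version number is a placeholder, and I am assuming each worker reports a product string of the form `Name/1.2.3.4` when it registers.

```typescript
// Placeholder value: in practice the allowlist would be written by the
// CI job that builds the pinned worker image.
const PINNED_CHROME_VERSIONS: ReadonlySet<string> = new Set([
  "120.0.6099.109",
]);

// Hypothetical registration payload sent by a worker on startup.
interface WorkerRegistration {
  workerId: string;
  // Product string as the worker reports it, e.g. "HeadlessChrome/120.0.6099.109".
  chromeVersion: string;
}

// Pull the four-part dotted version out of the reported string.
// Returns null when no such version is present.
function parseChromeVersion(reported: string): string | null {
  const match = /(\d+\.\d+\.\d+\.\d+)/.exec(reported);
  return match?.[1] ?? null;
}

// Accept a worker only if its exact Chrome version is on the allowlist.
function acceptRegistration(
  reg: WorkerRegistration,
  allowed: ReadonlySet<string> = PINNED_CHROME_VERSIONS,
): boolean {
  const version = parseChromeVersion(reg.chromeVersion);
  return version !== null && allowed.has(version);
}
```

Matching the exact version rather than only the major number is deliberate in this sketch: the aim is to catch any silent upgrade, so anything not explicitly pinned is rejected.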

Tradeoffs and real limitations

What I learned and what I actually take away

The pool is not about reducing work; it’s about shifting where the work lives. I traded flaky, repeated CI failures for a small, predictable maintenance task and one server that I own. That swap makes me calm before demos. It also forced better habits: pin binaries, bake images in CI, and never let sensitive data touch the renderer.

If you’re tempted to replicate this:

One takeaway to leave you with: if your tests are unstable because of the environment rather than your code, invest in a small, stable environment you control. It won’t be free, but you’ll stop explaining flaky demos to product every sprint.