← Home

I built a GPU farm before anyone called it AI infrastructure

2016. Ethereum was new. Most people hadn't heard of it. I saw an opportunity and built a liquid-cooled GPU mining pool from scratch. Not buying premade miners. Designing the whole thing. The physical layout. The airflow. The cooling loops. The power distribution. The monitoring.

Building the rigs

Custom rigs. Open-air frames because cases would just trap heat. Multiple GPUs per board, each one pulling 200+ watts. The power draw was serious. I had to think about circuits, breakers, and what happens when you flip everything on at once. Inrush current is not a theoretical problem when you have a room full of GPUs.

The cooling was custom loops. Reservoirs, pumps, radiators, tubing runs planned to minimize bends. Quick-disconnect fittings because cards fail and you need to swap them without draining the whole system. Leak detection sensors under every rack because one drip at 3am can turn expensive hardware into expensive paperweights.

Running the operation

Managing hash rates. Pool economics. Power costs per kilowatt-hour versus expected returns. Overclocking memory while undervolting cores because mining workloads care more about memory bandwidth than clock speed. Finding the sweet spot between performance and power draw. Monitoring every card's temperature, fan speed, and hash rate around the clock.

Failover mattered. If a rig went down, you lost money every minute it was offline. So I built monitoring that would alert on hash rate drops, temperature spikes, or cards falling off the bus. Auto-restart scripts. Watchdog timers. The kind of operational discipline that people now associate with SRE practices.

What I learned

I learned more about GPU compute in those months than any certification would teach. How GPUs actually work under sustained load. How cooling systems fail. How power distribution works at scale. How to monitor a fleet of machines and respond to failures before they cascade.

The rigs are long gone. The knowledge stuck. When I design AI infrastructure today, I'm not starting from a vendor whitepaper. I'm starting from the feeling of a coolant leak at 3am and the sound of a pump dying under load. That kind of knowledge doesn't come from reading. It comes from building.