What Is GPU Compute (in Plain English) and Why Prices Swing So Much
GPU compute is the processing power a graphics card provides when it is put to work on calculations beyond displaying images. Think of a GPU as thousands of small calculators running at once. Unlike a CPU, which has a few powerful cores designed for complex sequential tasks, a GPU excels at parallel processing: doing many simple calculations at the same time. That makes it ideal for workloads like training artificial intelligence models, analyzing large datasets, or running complex simulations.
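As a rough illustration of the difference, the sketch below is a hypothetical example in plain Python with NumPy (not taken from any particular workload): it contrasts processing values one at a time, the way a CPU loop would, with expressing the same work as a single operation over a whole array, the style of computation GPUs are built to run across thousands of elements simultaneously. NumPy vectorization is only an analogy for GPU parallelism here.

```python
import time
import numpy as np

# One million simple calculations: square each value and add 1.
values = np.random.rand(1_000_000)

# CPU-loop style: handle one element at a time.
start = time.perf_counter()
serial_result = [v * v + 1.0 for v in values]
serial_time = time.perf_counter() - start

# GPU style (approximated with NumPy vectorization): describe the work
# as one operation over the whole array, so it can be applied to many
# elements at once instead of element by element.
start = time.perf_counter()
vector_result = values * values + 1.0
vector_time = time.perf_counter() - start

print(f"element-by-element loop: {serial_time:.3f}s")
print(f"whole-array operation:   {vector_time:.3f}s")
```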
When companies need GPU compute, they typically buy it by the hour. A single NVIDIA H100 GPU, one of today's most powerful AI training chips, rents for around $4.10-$4.80 per hour according to Ornn Compute Exchange's H100 Index. To put this in perspective, training a large language model like GPT-4 required approximately 6.4 million GPU-hours, at an estimated cost of more than $150 million. Even smaller tasks add up quickly: fine-tuning a 7-billion-parameter model might consume 1,000-5,000 GPU-hours, which at those hourly rates translates to roughly $4,000-$24,000 in compute costs.
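For readers who want to sanity-check these figures, here is a minimal back-of-the-envelope sketch. The hourly range and GPU-hour counts come from the text above; the function name is just illustrative.

```python
def compute_cost(gpu_hours: float, rate_per_hour: float) -> float:
    """Total rental cost for a workload: GPU-hours times the hourly rate."""
    return gpu_hours * rate_per_hour

# H100 hourly range quoted above.
LOW_RATE, HIGH_RATE = 4.10, 4.80

# Fine-tuning a 7-billion-parameter model: 1,000-5,000 GPU-hours.
low_end = compute_cost(1_000, LOW_RATE)    # about $4,100
high_end = compute_cost(5_000, HIGH_RATE)  # about $24,000
print(f"Fine-tuning estimate: ${low_end:,.0f} to ${high_end:,.0f}")
```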
GPU prices are notoriously volatile for three main reasons:
- Hardware supply constraints: GPU production involves complex semiconductor manufacturing with long lead times. Shortages of critical components, packaging bottlenecks, or logistics issues can squeeze supply and push prices up dramatically.
- Demand spikes from AI model launches: when companies like OpenAI or Google release new models, the scramble to train or fine-tune competitors builds acute demand pressure on available inventory.
- Infrastructure constraints: beyond the chips themselves, GPU deployments require specialized cooling, high-speed networking, and massive power supplies. Data centers that can deliver that stack charge premiums, while energy costs and facility limits add another layer of volatility.
Mini-Glossary
- GPU-hour: One graphics processing unit running for one hour; the basic unit for measuring and pricing compute time.
- Utilization: The percentage of time a GPU spends actively computing versus sitting idle, which directly affects cost-effectiveness (see the sketch after this list).
- Forward contracts: Agreements that lock in rates for months ahead, often reducing costs relative to volatile on-demand pricing when workloads are predictable.
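To make the utilization and forward-contract terms concrete, here is a minimal sketch. The 65% utilization figure, the $3.50 forward rate, and the 10,000 GPU-hour workload are hypothetical placeholders rather than quotes from any provider; the on-demand rate reuses the H100 range above.

```python
def effective_cost_per_useful_hour(hourly_rate: float, utilization: float) -> float:
    """Cost per hour of useful compute: idle time is still billed,
    so lower utilization raises the effective price of the work done."""
    return hourly_rate / utilization

# Hypothetical scenario: on-demand H100 at $4.50/hr, 65% utilization.
on_demand_rate = 4.50
utilization = 0.65
print(f"Effective cost per useful GPU-hour: "
      f"${effective_cost_per_useful_hour(on_demand_rate, utilization):.2f}")

# Hypothetical forward contract: a locked-in $3.50/hr rate for a
# predictable 10,000 GPU-hour workload versus paying on-demand.
forward_rate = 3.50
gpu_hours = 10_000
savings = (on_demand_rate - forward_rate) * gpu_hours
print(f"Forward contract saves roughly ${savings:,.0f} on {gpu_hours:,} GPU-hours")
```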