GPU prices are sky-high, but poised to collapse

Cast AI warns

According to research by Cast AI, AI chip prices are approaching a tipping point. After several years of steady increases, we can expect a sharp decline in the coming quarters. However, “GPU prices don’t follow logic; GPU availability and pricing are a mess.”

That is the summary of Cast AI co-founder and president Laurent Gil. He explains in a commentary on the newly published research that GPUs are purchased years in advance. As a result, he believes we need to revise our view of the elastic cloud that expands and contracts based on demand. “With such high investment stakes, the industry has effectively reverted to glorified data centers; cloud elasticity is largely an illusion unless you can leverage automation and agents to stay ultra-agile in locating and provisioning what you need.”

Small group of major players

For several years now, individual GPUs have typically cost tens of thousands of dollars. The price difference between the available variants is partly determined by performance, although virtually every AI chip from Nvidia or AMD immediately finds a buyer. Prices vary greatly by region, Cast AI notes, leading to a highly unpredictable situation. These price differences can also be explained by the fact that some regions simply receive more GPU deliveries. Nvidia’s largest customers are predominantly located on one continent: North America. AWS, Azure, GCP, and “neoclouds” such as CoreWeave and Crusoe are building gigantic “exascale” data centers in the US.

These major players therefore purchase GPUs well ahead of time, sometimes up to three years in advance. They then sell that capacity before their infrastructure even exists, Cast AI points out. The reasoning, the company explains, is that if you don’t buy capacity now, you may not be able to find any later. The danger in this scenario is that new chip generations quickly push old products out of the market. This keeps the “hype cycle,” as Gil puts it, alive.

Big discount

A lucrative period may be approaching for end users. If they choose their region carefully now, they can save up to 80 percent on GPU rental costs at AWS. The research shows that such drastic price drops have already occurred. For example, an H100 instance in September of this year was 88 percent cheaper than it was in January 2024.
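
To make the arithmetic behind such percentages concrete, here is a minimal sketch in Python. The region names and hourly prices are hypothetical placeholders chosen for illustration; they are not figures from the Cast AI research, and the helper `percent_saving` is likewise an assumption of this example.

```python
def percent_saving(baseline: float, candidate: float) -> float:
    """Return how much cheaper `candidate` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline price must be positive")
    return (baseline - candidate) / baseline * 100.0


# Hypothetical hourly GPU instance prices in USD per region.
# These are placeholders, not data from the Cast AI research.
hourly_price = {"region-a": 10.0, "region-b": 4.0, "region-c": 2.0}

# Compare the cheapest region against the most expensive one.
most_expensive = max(hourly_price, key=hourly_price.get)
cheapest = min(hourly_price, key=hourly_price.get)
saving = percent_saving(hourly_price[most_expensive], hourly_price[cheapest])

print(f"{cheapest} is {saving:.0f}% cheaper than {most_expensive}")
```

With these placeholder numbers, the cheapest region costs one fifth of the most expensive one, which corresponds to a saving of 80 percent. The same function applies to a price drop over time: an instance that falls from 100 to 12 price units has become 88 percent cheaper.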

“There is an uncomfortable truth to this: adapt or overpay,” says Gil. For example, those who remain in their own region for sovereignty reasons risk significantly higher prices for running AI workloads. This highlights that the GPU market is currently shaped by the largest companies, which are concluding deals worth tens of billions of dollars. The question is how long this will last. OpenAI, finally free to become a for-profit company after a Microsoft deal this week, is a crucial factor in this. If it fails to increase its revenue by billions upon billions each year, the stock market values of the largest enterprise AI companies are at risk of collapsing. If the AI hype is indeed converted into AI reality any time soon, then the major investments will turn out not to have been for nothing.