DeepSeek says training its reasoning-focused R1 model cost just $294,000. That figure is far below what US companies estimate comparable training runs cost, and it intensifies the debate over China’s position in the global AI race.
This was reported by Reuters. In January, DeepSeek already made a name for itself by introducing relatively inexpensive AI systems. That announcement caused turmoil on the stock markets, as investors feared that established players such as Nvidia would lose their lead. Since then, the company has largely kept a low profile, with only a few product updates, while founder Liang Wenfeng has hardly made any public statements.
In an article in Nature, DeepSeek revealed that R1 was trained on a cluster of 512 Nvidia H800 chips over a total of 80 hours. It is the first time DeepSeek has shared concrete figures about its training costs. For comparison, OpenAI’s Sam Altman said last year that training foundation models costs more than $100 million, without providing further details.
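Taking the reported figures at face value, the implied compute budget is straightforward to derive. A back-of-the-envelope calculation (illustrative only, not a breakdown published by DeepSeek):

```python
# Figures as reported: 512 H800 GPUs, 80 hours, $294,000 total.
gpus = 512
hours = 80
total_cost_usd = 294_000

gpu_hours = gpus * hours                      # total GPU-hours consumed
cost_per_gpu_hour = total_cost_usd / gpu_hours  # implied hourly rate per GPU

print(gpu_hours)                    # 40960
print(round(cost_per_gpu_hour, 2))  # 7.18
```

That works out to roughly 41,000 GPU-hours at about $7 per GPU-hour, which is why the claim invites scrutiny: the total is orders of magnitude below the nine-figure budgets Altman described.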
DeepSeek’s claims raise questions, especially since the H800 chips were designed by Nvidia specifically for the Chinese market after Washington banned the export of the more powerful H100 and A100 chips. US sources previously claimed that DeepSeek had obtained large numbers of H100 chips. However, the company insisted that it had only used H800s. In an additional statement, DeepSeek admitted for the first time that it also owns A100 chips and used them during preliminary experiments with smaller models.
No OpenAI models copied
The relatively low cost of R1 can be partly explained by the model distillation method. In this method, a new model learns from an existing system, so that less computing power is required. American AI experts suggested that DeepSeek may have deliberately copied models from OpenAI. However, the Chinese company emphasizes that distillation is a common technique that enables better performance at lower costs, thereby broadening access to AI.
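The technique itself is standard in machine learning: instead of training only on hard labels, the smaller "student" model is trained to match the full output distribution of a larger "teacher" model. A minimal sketch of that idea in plain Python, with hypothetical logits over three answer candidates (this illustrates the general method, not DeepSeek's actual pipeline):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # distribution, exposing the teacher's relative preferences
    # between answers rather than just its top pick.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the softened teacher and student
    # distributions: the student is penalized for diverging from
    # the teacher's full output distribution.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits: the teacher is confident, the student is not yet.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 1.0]
print(distillation_loss(teacher, student))
```

Minimizing this loss pulls the student toward the teacher's behavior, which is why distillation lets a cheaper model recover much of a larger model's performance with far less compute.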
In addition, DeepSeek indicated that the training data for its V3 model included web pages containing answers from other AI systems. According to the company, this was an unintended side effect of using public sources, not a deliberate attempt to copy competitors’ knowledge. Whether the low costs are truly representative, and what they mean for international competition, remain to be seen.
Also read: DeepSeek prioritizes research over revenue