DeepSeek delayed by GPU export restrictions

Development of DeepSeek’s upcoming R2 model has stalled because the company cannot get enough Nvidia GPUs, according to a report.

The Information cites two anonymous sources familiar with DeepSeek’s efforts. They say the company has been working on R2 for months, but CEO Liang Wenfeng is not yet satisfied with it, and the company cannot improve the model’s capabilities with the limited number of GPUs at its disposal.

DeepSeek rose to prominence earlier this year when it released its original reasoning model, R1. The model proved to be a match for the most advanced models from US companies such as OpenAI, Anthropic, and Meta Platforms, despite being built at a fraction of the cost.

According to The Information, DeepSeek trained R1 on a cluster of 50,000 Hopper GPUs, including approximately 10,000 H100s, 10,000 H800s, and approximately 30,000 weaker H20 GPUs, which were developed specifically for the Chinese market.

Secret deliveries

Chinese companies were never legally allowed to purchase the H100 or H800 GPUs. It is believed that some of them were secretly supplied to DeepSeek by investor High-Flyer Capital Management, while others were acquired through shell companies with access to public cloud infrastructure services. The H20 GPUs were purchased legally but are now hard to come by because new US restrictions prohibit their export to China.

Part of the problem is that many of the H20 GPUs in China are already in use by DeepSeek’s customers. The Information reports that the R1 model has been widely adopted by Chinese companies and government agencies. Most organizations run it on H20 GPUs in the cloud. As a result, there is no capacity left for DeepSeek to train its latest model.

The shortage of H20 GPUs is reportedly already causing problems for R1 by limiting how Chinese companies can use it. If R2 is a significant improvement over R1, insiders expect demand for the model to exceed what Chinese cloud infrastructure can handle, according to conversations The Information had with DeepSeek employees.

Export rules limit bandwidth

The H20 is similar to the H100 GPU that Nvidia sells to Western companies, but its bandwidth and connectivity are limited to comply with earlier US export rules for China. However, President Donald Trump’s administration decided that even this watered-down chip is too powerful to export to a geopolitical rival and imposed new restrictions in April.

That decision disrupted the plans of Chinese AI developers. Although there are some domestic alternatives, such as Huawei’s Ascend 910B chip, they are even less powerful than the H20. They also do not support Nvidia’s CUDA software stack, the programming platform used to optimize applications and AI models for Nvidia GPUs. This is problematic because virtually all Chinese AI developers are believed to use CUDA.

The Information reports that DeepSeek’s R1 and R2 models have also been optimized for Nvidia chips, and that the inability to obtain them could be a major setback in the effort to keep pace with US competitors.