Nvidia will be well aware of the huge demand that exists for AI-capable hardware. With DGX Cloud, customers can now run AI workloads for a monthly fee, avoiding the immense cost of a dedicated AI supercomputer.
Nvidia has an issue it will be happy to face. Demand for its GPUs is huge thanks to the sustained AI hype, as Nvidia's chips remain the best performers in that field. So many companies are currently clamouring for hardware that the tech giant is struggling to keep up with demand. As it happens, the company already leads the graphics market at a canter, with 90 percent market share in the enterprise sector.
For now, that means many customers have few options for running their AI workloads. On-premises, older hardware is significantly slower, so training larger models can be time-consuming, inefficient and therefore costly to run. To save customers from being forced to scale down as a result, Nvidia plans to make its most capable AI hardware available on a rental basis.
For the cloud, from Oracle or Nvidia itself
Nvidia has cloud capabilities of its own, but for the launch of DGX Cloud it already has a partner of note: Oracle. That company's customers can tap into thousands of Nvidia GPUs through Oracle Cloud Infrastructure. DGX Cloud will also become available on Microsoft Azure and, in due time, Google Cloud.
For on-prem customers, a DGX platform was already available. Its latest version deploys eight expensive H100 GPUs that communicate with each other via the super-fast NVLink interconnect. This variant delivers a staggering 32 petaflops of compute performance, according to the GPU giant. DGX Cloud can also cooperate with locally available clusters, but on its own it is already an AI supercomputer capable of training larger models at pace.
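As a back-of-the-envelope check, the quoted system total implies roughly 4 petaflops per H100. Note that the 32-petaflop figure is Nvidia's claim for the eight-GPU node; the per-GPU split below is an inference, not an official number:

```python
# Implied per-GPU throughput of an eight-GPU DGX node.
# The 32-petaflop total is Nvidia's figure; the per-GPU split is an inference.
NUM_GPUS = 8
TOTAL_PETAFLOPS = 32

per_gpu = TOTAL_PETAFLOPS / NUM_GPUS
print(f"Implied throughput per H100: {per_gpu} petaflops")  # 4.0
```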
One similarity to the on-prem variant is that with DGX Cloud, customers can tap into more than 100 AI frameworks and pre-trained models available on Nvidia's AI Enterprise platform. These can be built upon with custom training data or proprietary models. For many companies, it is essential to have as much control over the data set in question as possible. Where an LLM like GPT-4 is a sort of jack-of-all-trades, organizations often prefer a model explicitly trained on relevant data. In addition, banks, healthcare facilities and legal firms, for example, face strict compliance legislation, which means training data must meet requirements imposed by law.
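Conceptually, building on a pre-trained model means continuing training from existing weights on your own data rather than starting from scratch. A toy sketch of that idea in plain Python (not Nvidia's tooling; the weights and data here are made up for illustration):

```python
# Toy illustration of "building on a pre-trained model": continue gradient
# descent from existing weights on new, domain-specific data.
# One-dimensional linear model; purely conceptual, not Nvidia's stack.

pretrained_w, pretrained_b = 2.0, 0.5   # weights from "generic" training

# Hypothetical proprietary data that roughly follows y = 3x + 1
domain_data = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]

def fine_tune(w, b, data, lr=0.01, epochs=500):
    """A few epochs of per-sample gradient descent from pre-trained weights."""
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x   # gradient of squared error w.r.t. w
            b -= lr * err       # gradient of squared error w.r.t. b
    return w, b

w, b = fine_tune(pretrained_w, pretrained_b, domain_data)
print(f"fine-tuned: w={w:.2f}, b={b:.2f}")  # ends up close to w=3, b=1
```

The same principle scales up: a large pre-trained model provides the starting point, and the organization's own data nudges the weights toward its domain.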
Also read: Nvidia to demand a premium for AI chips