Vultr has launched a new architecture for GPU-driven data centers. AMD, Broadcom and Juniper Networks are the leading partners for the brand new concept. It shows Vultr’s competitors that there is life beyond Nvidia in the GenAI field.
We’ve recently discussed several AI infrastructure providers, that is, parties offering cloud resources with an emphasis on GPU compute. For example, DataCrunch is a promising European player, while the American CoreWeave has attracted the interest of Cisco and Pure Storage. What they have in common is that it takes real effort not to become an Nvidia customer when using such AI hyperscalers. In that area, Vultr is now clearly delivering an alternative.
GPUs as the hub of the data center
Now that GPU data centers are a reality, companies like Vultr have to find the right partners to maximize performance. For example, Broadcom and Juniper were chosen for their Ethernet technology and networking offerings. Admittedly, there is a lot of complex work to be done there, but ultimately, a GPU data center is about feeding data as quickly as possible to the most critical component: the AI chip. Usually that’s a GPU, though not always.
More often than not, there is little choice in that particular area. Nvidia is available as an option for remote GPU compute anywhere in the world, at any time. End users expect it, so suppliers have to deliver it. If it can’t be delivered through a Preferred Partner like Vultr, CoreWeave, or DataCrunch, Nvidia itself is also a cloud option. However, the CUDA software and tooling ecosystem is closed off and exclusive. As a result, end users regularly find themselves tied to Nvidia’s AI platform.
Open standards
ROCm, AMD’s open-source alternative to Nvidia’s AI software stack, is the foundation for Vultr’s new architecture. It is open enough that even Intel, should its AI GPU division ever become competitive, could be brought in as a chip supplier. Vultr therefore keeps its options open. Still, it is a sign of confidence in AMD’s MI300X chip that Vultr chose to build a completely new architecture around it for customers. The next few years will presumably see the move to the MI325X, the latest model, and later to the MI355X.
“Open ecosystems are the foundation of innovation,” said J.J. Kardwell, CEO of Vultr. “Our collaboration with AMD, Broadcom, and Juniper Networks empowers enterprises and AI innovators to harness the full potential of accelerated computing with the highest levels of flexibility, scalability, interoperability, and security.”
Vultr’s first AMD-powered supercomputing cluster will land in Chicago. It is only a matter of time before the rest of the world has an AMD cluster nearby. Ethernet has long since won the battle against Nvidia’s InfiniBand for AI connectivity. That is a major achievement, as it has prevented total lock-in for AI infrastructure.
“Ethernet has become the de facto technology for backend networks in large-scale AI deployments. Broadcom’s leading switch silicon and network adapters are accelerating such networks to ever higher performance,” said Ram Velaga, Senior Vice President and General Manager, Core Switching Group, Broadcom. “We are proud to work with AMD, Juniper and Vultr to power this supercompute cluster based on open Ethernet networking.”