Recently, after some wrangling, Nvidia received approval to acquire Israel-based Run:ai. Now the deal is officially done, and Nvidia also plans to make Run:ai’s software open-source. Why is it doing that?
According to insiders, Nvidia paid $700 million for Run:ai, a software maker focused on GPU cluster orchestration for AI workloads. The combination is an obvious one: via Run:ai Dev and the Run:ai API, Control Plane and Cluster Engine, AI workloads can be managed with fine-grained control.
The software currently works only on Nvidia GPUs, as is often the case for enterprise AI software. After all, Nvidia’s AI chips are utterly dominant in this area, with only Google Cloud’s TPUs and a cabinet of curiosities of specialized accelerators providing some variety. The vast majority of the AI ecosystem, however, revolves around CUDA, Nvidia’s own software platform. It just so happens that Run:ai and Nvidia aim to change that.
Conditional approval?
The European Union was the only regulatory body to push back against Nvidia’s takeover of Run:ai. It is not clear whether anything came of its initial doubts. “Concrete risks to competition” had been flagged by EU member state Italy, but after investigating the matter further, as it routinely does, the Commission ruled over a week ago that there were no major concerns.
The approval in and of itself is unsurprising: there are many ways to optimize Nvidia GPUs, and Run:ai is just one of many players doing so. Cloud GPU specialists like CoreWeave receive extensive support from Nvidia to maximize their compute for end customers, and this is regularly done over Ethernet connectivity, not just over the InfiniBand interconnect Nvidia favors. It goes to show that Nvidia is perfectly willing to play nice with potential rivals as things stand.
Still, it seems that some skepticism from regulators needed to be addressed. Now that Run:ai is officially part of Nvidia, the newly acquired company plans to open-source its software. In its own words: “While Run:ai currently supports only NVIDIA GPUs, open sourcing the software will enable it to extend its availability to the entire AI ecosystem.”
Trickier than it looks
The statement above essentially means that anyone who wants to wire their own AMD or Intel support into the Run:ai stack can do so at their leisure. That is happening more often these days, usually to make ROCm, AMD’s equivalent of CUDA, deployable for large-scale AI workloads. As the only serious alternative to Nvidia, AMD’s Instinct GPUs are supported, albeit with some workarounds.
“The reality is that people want to write at higher levels of abstraction,” AMD SVP of AI Vamsi Boppana told The Register. Consider PyTorch, which also supports AMD and Intel hardware. However, that support isn’t as deeply ingrained as it seems: various plugins and other regularly used tooling go missing in action when you’re not on an Nvidia chip, and optimizations for Nvidia (and only Nvidia) remain common in popular AI tooling.

From his own experience, Creative Ventures General Partner James Wang calls past CUDA alternatives a “pain in the ass.” He compares Nvidia’s control over the AI stack to Apple’s dominance over its own ecosystem. Anyone who remembers how poorly Android ports of iOS apps were supported in the early years of the smartphone revolution knows how stark that contrast can be. GPU optimization for AI workloads is now in a similar phase, with Nvidia being as Apple-like as it dares to be.
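Boppana’s point about abstraction is easy to illustrate. A minimal sketch, assuming a machine with PyTorch installed against either a CUDA or a ROCm backend (the pick_device helper is purely illustrative, not a PyTorch API): a ROCm build exposes AMD GPUs through the same “cuda” device string, so framework-level code stays largely portable.

```python
# Rough sketch of device-agnostic PyTorch code. On a ROCm build of PyTorch,
# the "cuda" device string maps to AMD GPUs via HIP, so the same script can
# run on Nvidia and AMD hardware; only the installed PyTorch build differs.
import torch

def pick_device() -> torch.device:      # illustrative helper, not a real API
    if torch.cuda.is_available():       # True on both CUDA and ROCm builds
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
model = torch.nn.Linear(512, 512).to(device)
x = torch.randn(8, 512, device=device)
y = model(x)                            # same high-level code, either backend
print(f"Ran on {device}, output shape {tuple(y.shape)}")
```

The gaps Wang describes tend to show up one layer down, in the custom kernels, plugins and performance tooling that assume Nvidia-specific features.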
This means that open-sourcing a product like Run:ai poses no problems for Nvidia. It’s a good move, and not only because it objectively increases choice for developers. Above all, it’s a repeat of what happened with earlier AI tooling: the optimization for Nvidia has long since taken place; now it’s up to the alternatives to build a real ecosystem that can compete with it. Run:ai itself, at least, has not seen that need so far.
Also read: EU approves Nvidia’s acquisition of Run:ai