CoreWeave, a provider of GPU-as-a-Service, is taking a step toward expanding its AI services with the launch of a serverless platform for reinforcement learning. The company wants to help businesses train more complex AI models faster, without having to invest in expensive hardware themselves.
This was reported by The Register. Reinforcement learning, or RL, is a machine learning method in which a model learns by being rewarded for good outcomes and penalized for mistakes. The technique has recently become popular for refining language models. For example, DeepSeek R1’s reasoning ability was partly achieved through RL techniques.
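To make the reward-and-penalty idea concrete, here is a minimal sketch of a toy RL loop: an epsilon-greedy agent choosing among three actions. It is not how language models are trained with RL, and it has nothing to do with CoreWeave's platform; the function name `train_bandit`, the success probabilities, and the +1/-1 reward scheme are all assumptions chosen for illustration.

```python
import random


def train_bandit(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy agent: learns which action pays off most often.

    success_probs is a hypothetical environment: entry i is the chance
    that action i leads to a good outcome.
    """
    rng = random.Random(seed)
    values = [0.0] * len(success_probs)  # running reward estimate per action
    counts = [0] * len(success_probs)    # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(success_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(values)), key=values.__getitem__)

        # Reward a good outcome (+1) and penalize a mistake (-1).
        reward = 1.0 if rng.random() < success_probs[action] else -1.0

        # Incremental average: nudge the estimate toward the observed reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


values, counts = train_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=values.__getitem__)
print("estimated values:", [round(v, 2) for v in values])
print("preferred action:", best_action)
```

Over many steps the agent's estimates drift toward the true average reward of each action, so it ends up favoring the one that succeeds most often. RL fine-tuning of language models uses far more elaborate machinery, but the feedback signal plays the same role.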
The new platform, Serverless RL, combines technologies from two recent CoreWeave acquisitions: OpenPipe and Weights & Biases. OpenPipe focused on building RL-based AI agents, while Weights & Biases provides a serverless infrastructure for GPU-accelerated workloads. By combining this knowledge, CoreWeave aims to make model optimization more accessible to a broader business audience.
CoreWeave positions itself as a GPU-as-a-Service provider focused solely on AI computing power. Its infrastructure relies heavily on Nvidia hardware, which delivers a high degree of efficiency but also makes the platform dependent on a single supplier. The GPU clusters are built around Nvidia’s own networking architecture, including InfiniBand support, and are controlled using CoreWeave’s own management software. The company targets customers who want to run large-scale AI training or inference without having to delve into the underlying infrastructure.
The serverless architecture is central to the new platform. Workloads are automatically distributed across available GPUs, so that capacity that would otherwise sit idle is put to use. Many AI applications are also stateless, meaning they do not need to store information between sessions, which makes them well suited to this model. As a result, companies can train their models without managing their own servers or virtual machines. CoreWeave emphasizes that customers pay only for the number of tokens generated during fine-tuning.
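The pay-per-token model comes down to simple arithmetic, sketched below. The price of 4 dollars per million tokens, the run names, and the token counts are invented for illustration; CoreWeave's actual rates are not given here.

```python
from dataclasses import dataclass


@dataclass
class FineTuneRun:
    name: str
    tokens_generated: int


def token_cost(runs, usd_per_million_tokens):
    """Bill only for tokens generated; idle time adds nothing to the total."""
    total_tokens = sum(run.tokens_generated for run in runs)
    return total_tokens * usd_per_million_tokens / 1_000_000


# Hypothetical runs and a hypothetical price per million tokens.
runs = [
    FineTuneRun("agent-v1", 2_000_000),
    FineTuneRun("agent-v2", 500_000),
]
cost = token_cost(runs, usd_per_million_tokens=4.0)
print(f"total cost: ${cost:.2f}")
```

Under these assumed numbers, 2.5 million generated tokens come to 10 dollars, regardless of how long the GPUs behind the runs were reserved or idle.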
Reducing dependence on large customers
The introduction of Serverless RL aligns with CoreWeave’s broader strategy to become less dependent on a handful of large customers. According to the prospectus filed earlier this year, 77 percent of 2024 revenue still came from just two parties. That share appears to be declining as new customers such as Google and IBM come on board, but the dependency remains high.
At the same time, there is unrest surrounding the GPU provider. Microsoft, a key partner for many years, is reportedly dissatisfied with missed deadlines and delivery problems and is said to have terminated several agreements. That decision could cost CoreWeave billions, as Microsoft had previously made commitments worth approximately $10 billion over the next five years.
In addition, CoreWeave recently acquired Monolith AI, a company that uses artificial intelligence to accelerate physics and engineering simulations. With this move, the company aims to strengthen its position as a provider of specialized AI infrastructure across a wide range of industries.