
With AI-based services in high demand, their cost will have disappointed many companies. Ampere Computing and French cloud provider Scaleway believe that needs to change. With “cost optimized” (COP) Arm instances, organizations could deploy AI workloads more cost-effectively than with the Nvidia alternative.

Scaleway unveiled the new servers at its ai-PULSE event in Paris. The COP Arm instances run on Ampere Altra chips. The new offerings are “tailored to meet the demands of AI-driven applications,” such as running a chatbot or analyzing large amounts of data. Note that they are not aimed at training models, the AI workload with the most demanding performance requirements.

France-based Scaleway has been around since 1999 and operates in Paris, Amsterdam and Warsaw. It serves 25,000 customers and already offers several cloud- and AI-focused options. For example, it also works with Nvidia to make AI hardware available in the cloud.

Efficient (but not a powerhouse)

Because training a model is a one-time process, Ampere CEO Jeff Wittich argues that performance in that area matters less than elsewhere. “In fact, general-purpose CPUs are good at inference, and they always have been,” Wittich said. “Inference is your scale model that you’re running all the time, so efficiency is more important here.”

And in that area, the companies promise up to 3.6 times less electricity consumption per inferencing workload than the Nvidia alternative. In short, the magic word is efficiency over raw performance. It should be noted that the example they use (inferencing for Whisper, an AI tool for speech recognition from OpenAI) was tested on an Nvidia A10 GPU. That chip is now over two years old and has long since been superseded by more modern variants. The Nvidia L40S, for example, is based on the much more efficient Ada Lovelace architecture and also delivers significant performance improvements. The problem: given the current shortages of Nvidia chips, they cost a fortune. That applies not just to buying them, but also to renting such performance in the cloud.

Ampere, at least, does not shy away from firm promises: in conversation with The Register, the company cites the CEO of France’s Lampi.ai, who stated that COP-Arm is 10 times faster for a tenth of the cost compared to the x86 competition.

Also read: Store your AI workloads in Google Cloud’s customized cloud storage services