
CAST AI specializes in optimizing cloud costs, particularly for Kubernetes. Today it is adding new features to its platform that extend this optimization to the costs of training AI models. This should make it more realistic for more organizations to get serious about generative AI.

Generative AI is very interesting and is getting a lot of attention (including here on Techzine). However, we have noted several times, including in our podcast episodes on ChatGPT and GPT-4 (in Dutch, unfortunately), that it comes with a very hefty price tag. One estimate puts the cost of keeping ChatGPT running at no less than $700,000 a day. That might be affordable for an organization like OpenAI, which has solid investments from Microsoft and others behind it. For an “ordinary” enterprise organization it is not, let alone for smaller organizations. So for generative AI to become more broadly interesting, those prices have to come down. Mind you, we are assuming here that generative AI is interesting for all organizations. We can argue about that, but that is not what this article is about.


CAST AI platform gets new features

CAST AI has focused on optimizing cloud costs, especially those related to Kubernetes, since its inception in 2019. Those costs can add up pretty quickly due to the volatile nature of containers.

Today's announcement, however, is not specifically about Kubernetes, but about training AI models. That also generally takes place in the cloud, be it AWS, Google Cloud Platform or Microsoft Azure. The updates that CAST AI is officially adding to the platform today should ensure that optimization takes place in this area as well. The platform automatically searches the three major clouds for the most cost-effective GPUs, selects them and handles the provisioning. When a GPU instance is no longer needed, the platform decommissions it. It can also replace a previously selected GPU instance with a cheaper one.
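To make the idea of selecting the most cost-effective GPU a bit more concrete, here is a minimal Python sketch. It is not CAST AI's actual implementation or API. The `GpuOffer` type, the helper functions, the instance names and the prices are all hypothetical and made up purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GpuOffer:
    """A hypothetical GPU instance offer; none of these fields come from CAST AI."""
    cloud: str            # e.g. "aws", "gcp" or "azure"
    instance_type: str    # placeholder name, not a real instance type
    gpu_memory_gb: int
    hourly_price: float   # made-up price, for illustration only


def cheapest_offer(offers: List[GpuOffer], min_gpu_memory_gb: int) -> Optional[GpuOffer]:
    """Return the lowest-priced offer that meets the memory requirement, or None."""
    eligible = [o for o in offers if o.gpu_memory_gb >= min_gpu_memory_gb]
    return min(eligible, key=lambda o: o.hourly_price, default=None)


def should_replace(current: GpuOffer, candidate: Optional[GpuOffer]) -> bool:
    """Swap a running instance only when a strictly cheaper suitable offer exists."""
    return candidate is not None and candidate.hourly_price < current.hourly_price


# Hypothetical catalogue spanning the three major clouds.
OFFERS = [
    GpuOffer("aws", "example-gpu-a", 16, 1.20),
    GpuOffer("gcp", "example-gpu-b", 24, 0.95),
    GpuOffer("azure", "example-gpu-c", 40, 2.10),
]

if __name__ == "__main__":
    best = cheapest_offer(OFFERS, min_gpu_memory_gb=16)
    print(best)
    print(should_replace(OFFERS[0], best))
```

A real platform would also have to weigh availability, region, spot pricing and the cost of moving a running workload, none of which this sketch attempts to model.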


Furthermore, we see some updates specifically aimed at the AWS cloud. For example, the CAST AI platform optimizes the deployment of the Amazon Inferentia machines used to run AI models. It can also deploy Graviton processors while balancing performance and cost, CAST AI promises. Finally, the CAST AI platform takes care of managing spot instances, something we are increasingly seeing when Kubernetes is deployed across multiple clouds. The platform selects the optimal configuration for the requirements of a specific model and finds the most cost-effective machines to match.

How much savings?

CAST AI claims it can generally cut the cloud bill for customers in half. That is perhaps also what to expect on average when it comes specifically to training AI models, although this will no doubt depend on the availability of the GPUs needed for it. We can’t quite estimate how much leeway there is in this area across AWS, GCP and Azure. CAST AI does offer anecdotal evidence that the savings can be significant. It references a customer who realized savings of 76 percent when training AI models within Amazon EKS. So it seems there is quite a bit of wiggle room. In any case, it’s something to look at if you want to get started training AI models.
