Nutanix makes AI deployment approachable with GPT-in-a-Box 2.0

As attractive as AI may be, it tends to be exorbitantly expensive to deploy. Nutanix aims to cut costs significantly for customers by continuing to develop its GPT-in-a-Box solution.

We’ve written about GPT-in-a-Box before. In version 1.0, it was already possible to securely connect advanced AI models with proprietary data. 2.0 is a lot more ambitious, with a strong focus on simplicity (as is often the case with Nutanix).

Training, fine-tuning, inferencing

A major stumbling block to AI deployment is that it is extremely compute-heavy. Developing a foundation model yourself is rarely feasible: the training process is something only a privileged subset of companies can afford. Nutanix cites an earlier claim by Nvidia CEO Jensen Huang that training OpenAI’s GPT-4 required 15 megawatts (!) of power over 90 days. In short, this process is out of reach for most companies.

Fine-tuning and Retrieval Augmented Generation (RAG) are somewhat less demanding. These two techniques (or a combination of the two) make a foundation model suitable for specific AI applications. For example, an LLM can be tailored for customer service, banking or use as a security assistant. However, fine-tuning and RAG are not cheap in the cloud either. They can be done on-prem, but still require considerable computing power.
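To make the RAG half of that concrete, here is a minimal, self-contained sketch: retrieve the most relevant proprietary document and prepend it to the user's question before the prompt reaches a foundation model. The document store, bag-of-words similarity measure and prompt template are all illustrative assumptions, not Nutanix's implementation; production systems use learned embeddings and a vector database.

```python
from collections import Counter
import math

# Toy document store standing in for proprietary data (illustrative only).
DOCS = [
    "Refunds are processed within 5 business days of approval.",
    "Premium accounts include 24/7 phone support.",
    "Passwords must be rotated every 90 days per security policy.",
]

def _bow(text):
    """Bag-of-words vector; real RAG would use learned embeddings."""
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = _bow(query)
    ranked = sorted(DOCS, key=lambda d: _cosine(q, _bow(d)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    """Augment the question with retrieved context; the actual model
    call is omitted, since that is where the inference cost lives."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How fast are refunds processed?"))
```

The appeal over fine-tuning is that the foundation model's weights never change: updating the answers only requires updating the document store.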

Tip: Nutanix and Dell build multicloud solutions together

Nutanix sees inferencing as something it can help its customers with. Inferencing is what happens when AI models are actually put to work day to day. With Nutanix AI Inference Endpoint, the company offers a turnkey interface for management, access and auditing. It lets AI models run on-prem or in any cloud. Most parties prefer on-prem because it is a lot cheaper than continuous API use in the cloud. In this way, Nutanix promises to offer the benefits, not the burdens, of AWS, Azure and GCP.
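For a sense of what such an endpoint looks like from the application side: many on-prem inference stacks expose an OpenAI-style chat completions API. Whether Nutanix's AI Inference Endpoint follows exactly this shape is an assumption here; the URL, model name and token below are placeholders, and the sketch only constructs the request rather than sending it.

```python
import json

# Hypothetical on-prem endpoint URL; the real interface may differ.
ENDPOINT = "https://inference.example.internal/v1/chat/completions"

def chat_request(model, user_message, api_key="PLACEHOLDER"):
    """Build an OpenAI-style chat completion request for an on-prem
    endpoint. Actually sending it (urllib, requests, ...) is left out."""
    headers = {
        "Authorization": f"Bearer {api_key}",   # access token, auditable
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return ENDPOINT, headers, json.dumps(body)

url, headers, payload = chat_request("llama-3-8b", "Summarize our refund policy.")
print(url)
```

Because each request carries an identifiable token, access and usage can be audited centrally, which is the management angle the endpoint product targets.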

Inside Nutanix

The selection of models will continually expand. Nutanix also knows that this is a fast-moving target: every few months, a promising new LLM arrives. That’s why Nutanix partners with Hugging Face and Nvidia’s NIM to access all kinds of models through a single API. This greatly simplifies the software architecture for AI and avoids repetitive work when yet another new LLM arrives.

Choosing an endpoint takes just a few clicks within Nutanix. Testing is simple, too, taking place within the same interface that was used to pick the endpoint.

A key component for AI deployment is the already existing Nutanix Data Lens. This SaaS solution enables data classification within Nutanix Files. In addition to providing data security and visibility, Data Lens makes unstructured data suitable for AI. On top of this, it ensures that sensitive data is excluded, so that a customer service chatbot, for example, cannot reproduce proprietary information.
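To illustrate the kind of filtering described, here is a naive sketch that masks sensitive tokens before text enters an AI pipeline. This is not Data Lens itself, whose classification is far more capable than a pair of regular expressions; the patterns are illustrative assumptions.

```python
import re

# Illustrative patterns only; real data classification goes well
# beyond regexes (context, file metadata, learned classifiers).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Mask sensitive tokens before the text enters an AI pipeline,
    so a chatbot cannot later reproduce them."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
```

Running the filter upstream of ingestion, rather than at query time, means the sensitive values never reach the model or its retrieval index at all.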

And after that?

We asked Nutanix what’s next for GPT-in-a-Box and for its approach to AI more generally. We should expect any version 3.0 of this offering to simplify things even further. Ultimately, this is not just up to Nutanix: wherever data silos exist, the platform cannot see the data it could otherwise classify. It is therefore up to all vendors to make the practical deployment of AI a reality.

Also read: Nutanix scales up Project Beacon to cloud-native