Who will develop the OS for AI? VAST Data is going for it

Deeper integration means more easy buttons

VAST Data is developing rapidly. The trend is clear: it is folding more and more of the components that make up data pipelines into its own platform. The goal, for now: to be the OS, or operating system, for AI. What does VAST Data mean by this? And how does it benefit (enterprise) customers? We investigated.

“It feels like it’s getting closer and closer; I can really see it manifesting itself now,” says Field CTO Andy Pernsteiner about VAST Data’s development of an OS for AI. He knows better than most where the company comes from and what it has built, because he has worked there for about eight years, since before VAST even had a product to bring to market.

To understand where VAST Data stands today, we need to look back at how the company built its platform from the ground up, from pure storage for High Performance Computing (HPC) to data management, and ultimately to a complete ‘AI OS’.

Read also: VAST Data further expands Data Platform with InsightEngine

HPC as a starting point

In the early days, VAST Data’s focus was primarily on storing enormous amounts of data. “Even before we talked about AI, data had to be stored somewhere,” Pernsteiner notes. The company started out in the world of HPC. The choice of this sector was strategic: the scale and performance requirements in that world are enormous, so VAST more or less forced itself to set the bar very high. According to Pernsteiner, the skill set required to build what VAST has built is not something you learn in school; the implication is that the company has built something unique that cannot easily be replicated.

VAST Data saw an opportunity to approach HPC differently, we hear from Pernsteiner. Traditional HPC systems were fast, but historically not always reliable or feature-rich. VAST opted for an approach that focused on stability, built on a foundation of flash storage. Pernsteiner explains that VAST designed the architecture from the outset to treat hardware as if it could fail at any moment.

VAST’s Disaggregated, Shared-Everything (DASE) architecture means that compute and storage (and state) exist separately from each other. The result is a platform that is not only cost-efficient but, above all, extremely stable. Pernsteiner indicates that this stability was the main reason xAI chose VAST: its system was the only one that was stable enough. “CSPs (Cloud Service Providers) nowadays calculate how many GPU minutes they have lost,” says Pernsteiner. In the AI era, stability is directly linked to return on investment. “We may not always be the fastest, but we are very reliable,” he adds, underlining once again what matters most for AI workloads.

Enterprises expect more from HPC

VAST Data’s initial focus on HPC was definitely a good one, we can now say. However, customers soon came up with requirements that went beyond pure speed and stability. They wanted enterprise features such as security, encryption, and multi-tenancy in VAST’s systems. They had to be able to cut the platform into multiple pieces while maintaining strict monitoring and security. Once VAST had incorporated this, it effectively left the ‘old-fashioned’ HPC world behind and took its first steps towards AI applications.

From the outset, VAST wanted to achieve much more than offering customers systems where data is stored and then never touched again, we hear from Pernsteiner. The company began to recognize patterns in how customers used its systems. It saw that modern analytics stacks, such as those built with a Delta Lake framework on top of Parquet files, were inefficient. That is why VAST decided to build its own database, the VAST Database. This database is very well suited to large-scale deployment, Pernsteiner says. It also meant that VAST could, and did, do much more with structured data. Furthermore, it became a catalog full of metadata, providing a complete index of what is on a system.
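VAST has not published the internals of this catalog, but the idea itself is straightforward. As a rough illustration, here is a minimal sketch of a metadata index that records, for each object landing on a system, its schema and basic statistics, so that queries can locate relevant data without scanning the objects themselves. All names here are illustrative, not VAST's actual API.

```python
# Minimal sketch of a metadata catalog: an index over objects on a
# system, keyed by path, recording schema and row counts.

class Catalog:
    def __init__(self):
        self._entries = {}  # path -> metadata dict

    def register(self, path, schema, row_count):
        """Record an object's schema and row count as it lands."""
        self._entries[path] = {"schema": schema, "row_count": row_count}

    def find(self, column):
        """Return every object whose schema contains the given column."""
        return [p for p, meta in self._entries.items()
                if column in meta["schema"]]

catalog = Catalog()
catalog.register("/sales/2024.parquet", ["region", "revenue"], 1_200_000)
catalog.register("/logs/app.parquet", ["timestamp", "level", "msg"], 9_000_000)

print(catalog.find("revenue"))  # ['/sales/2024.parquet']
```

The point of such an index is that the question "where does revenue data live?" is answered from a few bytes of metadata instead of a scan over millions of rows.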

Bringing compute to the data

The next step in VAST’s evolution was to reduce the distance between the data and the compute. Pernsteiner describes this as searching for a “shorter route to the data.” By embedding components such as the Spark engine into the VAST platform itself, the platform delivered more intrinsic value as a whole. The idea: if you don’t need CPU cycles or DRAM to move data, the platform gains efficiency.

VAST Data also integrated an event broker into the VAST Data Platform. This was necessary because the company needed to take another step towards what Pernsteiner calls event-driven architectures. When data lands on the platform, the event broker springs into action. This is a real-time streaming engine that is compatible with Apache Kafka. VAST’s idea was to use this integration to remove a lot of complexity from the data pipeline, because it is no longer necessary to set up separate Kafka clusters, external brokers, or additional management layers. In addition, it opened up the necessary possibilities for automating flows on the Data Platform.
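The control flow of such an event-driven pipeline can be sketched in a few lines. A real deployment would talk to a Kafka-compatible broker over the network; the in-process toy broker below only illustrates the pattern: data lands, an event is published, and subscribed processing steps fire automatically instead of polling.

```python
# In-process sketch of an event-driven pipeline: publish/subscribe
# on topics, with processing steps triggered as data "lands".

from collections import defaultdict

class Broker:
    def __init__(self):
        self._subs = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, event):
        for cb in self._subs[topic]:
            cb(event)

processed = []

broker = Broker()
# Pipeline step: runs whenever a new object lands on the platform.
broker.subscribe("object-created", lambda e: processed.append(e["path"].upper()))

# Data lands -> event fires -> step runs, no polling required.
broker.publish("object-created", {"path": "/ingest/report.pdf"})
print(processed)  # ['/INGEST/REPORT.PDF']
```

Topic names and the event payload shape here are invented for the example; a Kafka-compatible broker adds persistence, partitioning, and consumer groups on top of this basic pattern.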

VAST’s ambition to “own the end-to-end data pipeline,” as Pernsteiner calls it, was reinforced by the development of a global namespace. In a reality where data is stored in multiple locations, VAST saw that the standards for on-prem (Cloudera) and cloud (Databricks) had their limits. Moving data involves trade-offs in both environments, such as additional costs. With its global namespace, which VAST has named Global Access, the company wants to offer a smarter way to manage data across different locations around the world.

Easy buttons for enterprise AI

With the rise of AI factories and the shift from model training to inference (the application of models), VAST saw a new challenge. Enterprises want to deploy AI, but often lack the expertise or infrastructure to build complex systems such as those for Retrieval-Augmented Generation (RAG). They need easy buttons, according to Pernsteiner.

VAST Data announced one of those easy buttons at the end of 2024 with InsightEngine, an application workflow that runs on the VAST Data Platform. This workflow focuses on a specific task: fetching and processing all enterprise data in real time. The VAST Data InsightEngine therefore makes it possible to perform real-time Retrieval-Augmented Generation (RAG). With RAG, you always draw on external data sources to gain insights. If these are unstructured data sources, they must first be converted. With InsightEngine, all of this can be done within the VAST Data Platform.
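The retrieval step that makes this work can be sketched independently of any vendor. In the toy example below, a bag-of-words embedding stands in for a real embedding model: documents are converted to vectors, and a question retrieves the closest chunks to hand to a language model as context. Everything here (the documents, the embedding, the scoring) is illustrative, not InsightEngine's actual machinery.

```python
# Minimal sketch of RAG retrieval: embed documents, embed the question,
# return the top-k closest documents by cosine similarity.

import math
from collections import Counter

def embed(text):
    # Toy embedding: word counts. Real systems use a neural model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "policy.txt": "travel expenses must be approved by a manager",
    "menu.txt": "the cafeteria serves lunch from noon",
}
index = {path: embed(text) for path, text in docs.items()}

def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(index, key=lambda p: cosine(q, index[p]), reverse=True)
    return ranked[:k]

print(retrieve("who approves travel expenses"))  # ['policy.txt']
```

The "conversion" of unstructured sources the article mentions is exactly the `embed` step; keeping it inside the platform means the index updates as soon as data lands, which is what makes the RAG real-time.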

Another important part of the VAST platform is the native vector database. This is the type of database needed to run AI workloads. Here too, VAST has added its own twist, we hear from Pernsteiner. “Most vector implementations are optimized for speed but limited in scale,” he says. He also sees challenges in the area of security when it comes to vector databases.

This is where VAST Data’s end-to-end control over the data pipeline comes into play again. This control makes it possible to apply the policies and Access Control Lists (ACLs) that govern the underlying objects directly to the vectors. A user can therefore never surface a result through semantic search if they do not have rights to the source document. This also immediately addresses the problem of data silos in enterprises. In addition, VAST has built data reduction deep into the platform, Pernsteiner points out. This reduces the bloat caused by vectorizing data.
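The mechanism described above can be sketched as follows: each vector keeps a pointer to its source document, and search results are filtered against the caller's rights on that document before anything is returned. The structures and names below are illustrative, not VAST's implementation.

```python
# Sketch of ACL-aware vector search: candidates are filtered by the
# caller's rights on the source document before being returned.

vectors = [
    {"vec_id": 1, "source": "hr/salaries.pdf"},
    {"vec_id": 2, "source": "public/handbook.pdf"},
]

acls = {
    "hr/salaries.pdf": {"alice"},            # only alice may read
    "public/handbook.pdf": {"alice", "bob"}, # whole team may read
}

def search(user, candidates):
    """Return only candidates whose source document the user may read."""
    return [c for c in candidates if user in acls[c["source"]]]

# bob's semantic search can never surface the salaries document:
print([c["vec_id"] for c in search("bob", vectors)])    # [2]
print([c["vec_id"] for c in search("alice", vectors)])  # [1, 2]
```

The design point is that the filter runs inside the search path, so there is no separate permission layer to keep in sync with the vector index.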

In addition to InsightEngine, VAST Data recently added another engine to its offering. AgentEngine can be seen as a runtime for rolling out AI agents and sits one layer higher in the VAST platform than InsightEngine. It allows you to open up any part of the VAST platform via MCP (Model Context Protocol). AgentEngine was therefore developed to make the pipelines we regularly refer to in this article part of the platform. For VAST, it is an important part of developing an OS for AI.

Even when it comes to agentic AI, where agents can potentially make autonomous decisions based on input from multiple models, it matters that VAST has focused on gaining complete control of the data pipeline. AgentEngine essentially turns a data pipeline into an MCP tool, including the end-to-end control we discussed above. Ultimately, this too must function as an easy button. However, Pernsteiner admits that standardizing in this area is not easy. Nvidia plays an important role here, but on-prem in particular there are all kinds of enterprise wishes and compliance requirements that must be taken into account.
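"Turning a pipeline into a tool" follows a generic pattern, regardless of vendor: a registry maps tool names and descriptions to pipeline functions, and an agent runtime invokes them by name when a model issues a tool call, much as MCP exposes tools to models. The sketch below shows that pattern only; it is not AgentEngine's API, and the tool name and return value are invented.

```python
# Generic sketch of a tool registry, the pattern behind exposing a
# data pipeline to an agent as a callable tool.

tools = {}

def register_tool(name, description):
    """Decorator that registers a pipeline function as a named tool."""
    def wrap(fn):
        tools[name] = {"description": description, "fn": fn}
        return fn
    return wrap

@register_tool("summarize_ingest", "Summarize what landed in a folder")
def summarize_ingest(folder):
    # Stand-in for a real pipeline run over the platform's data.
    return f"3 new files processed in {folder}"

def call_tool(name, **kwargs):
    """What an agent runtime does when a model issues a tool call."""
    return tools[name]["fn"](**kwargs)

print(sorted(tools))  # ['summarize_ingest']
print(call_tool("summarize_ingest", folder="/ingest"))
```

Because the tool body is an ordinary function running on the platform, the ACL and policy enforcement from the previous section applies to agent calls for free, which is the end-to-end control argument again.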

VAST wants to move closer to the GPU layer

However, the evolution of VAST Data does not stop at software abstractions. The underlying infrastructure is also part of the company’s broader vision. “The next phase is that we are moving closer and closer to the GPU layer,” says Pernsteiner.

VAST Data is essentially no longer a hardware supplier. On its own website, it states that VAST is sold as software but delivered and supported as an appliance. It does this through partnerships with other manufacturers. The partnership with Cisco is well known, but there are many more. A few years ago, for example, HPE launched new Alletra MP file storage that used VAST Data technology under the hood. And there are many more partners.

For VAST, this construction means that it can continue to focus on what it does best while taking advantage of the scale of players such as Cisco and HPE. However, there is still room for improvement, which is why VAST wants to move closer to the GPU layer. “We have started adding more options to the platform, allowing GPUs to be managed from within the system,” says Pernsteiner. He believes this is important because it allows customers to get even more out of their GPUs. It should create yet another layer of abstraction, making it even easier for customers to set up data pipelines without having to worry about the underlying plumbing.

Does VAST Data provide the OS for AI?

By strengthening the ties between the VAST Data system/platform and the underlying GPUs, VAST’s ambition to become an OS for AI is at least a step closer. Whether what the company is developing is or can become the OS for AI is still very much open to question. That depends on how specific you make the definition. There are quite a few layers within the infrastructure where something similar is being worked on.

We will undoubtedly see a huge stratification of OSs for AI, just as we have seen with platforms and fabrics. However, if we look at the data pipelines and platforms needed to deploy AI, we believe that VAST Data definitely has a point. The ever deeper integration of the various components of the data pipeline, and now also the move towards the GPU layer, certainly underlines this ambition.

VAST’s approach is, of course, not only a potentially beneficial development for customers, who are increasingly being given easy buttons to deploy AI. VAST itself will also benefit if it can take control of end-to-end operations and maintain that control. There is nothing wrong with that, as long as the balance between what the customer gets out of it and what VAST gets out of it is good.