To make the most of AI, companies need to draw on their own data, which makes LLM output more accurate and therefore more useful. But how do you get that data to the model? DataStax and Nvidia want to help with that.

DataStax offers Astra DB, a NoSQL database. Databases of this type are built for large volumes of data and low latency, both of which suit AI workloads. However, (re)training an AI model on new data is time-consuming. For this reason, DataStax is turning to RAG: Retrieval-Augmented Generation. With this technique, new data sources can be connected directly to AI models without retraining. On top of that, RAG reduces the number of LLM hallucinations, meaning the generation of erroneous information.
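
The retrieval step behind RAG can be illustrated with a minimal Python sketch. It is not DataStax's or Nvidia's implementation: the `embed` function is a toy word-count stand-in for a real embedding model, the documents are invented, and all function names are hypothetical. It only shows the general idea of ranking stored texts by similarity to a question and placing the best match in the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Put the retrieved context in front of the question for the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Invented example data standing in for a company's own documents.
documents = [
    "Our return policy allows refunds within 30 days of purchase.",
    "The office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]
query = "How do refunds work under the return policy?"
prompt = build_prompt(query, documents)
```

Because the model receives the retrieved text at question time, adding a new document to the list is enough to make it available; no retraining is involved. A production setup would replace `embed` with a real embedding model and the list with a vector database.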


Nvidia collaboration

Astra DB already offers vector embeddings, but DataStax is improving how they are deployed. This “AI multitool,” as Google has called vector embeddings, converts data into numerical data points (vectors) that AI models can process faster. With the help of RAG, these databases can serve as continuously updated stores of information for AI workloads.

DataStax uses Nvidia’s latest offering to generate vector embeddings for RAG faster. This week, Nvidia began offering new microservices that connect to inferencing and retrieval services. The result is a twentyfold increase in the number of embeddings generated per second, so databases can be refreshed much faster.

Nvidia can generate 800 embeddings per second per GPU, while DataStax’s Astra DB supports more than 4,000 transactions per second. The latency of these processes is less than 10 milliseconds, DataStax promises.
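
To put the quoted rate in perspective, here is a rough back-of-the-envelope calculation in Python. The 800 embeddings per second per GPU comes from the figures above; the document count and the assumption that throughput scales linearly with the number of GPUs are purely illustrative.

```python
# Rate quoted in the article: embeddings per second, per GPU.
EMBEDDINGS_PER_SECOND_PER_GPU = 800


def seconds_to_embed(num_items, num_gpus=1,
                     rate_per_gpu=EMBEDDINGS_PER_SECOND_PER_GPU):
    """Estimated time to embed num_items, assuming linear scaling across GPUs."""
    return num_items / (rate_per_gpu * num_gpus)


# Hypothetical workload: one million text chunks.
single_gpu = seconds_to_embed(1_000_000)             # 1250.0 seconds
four_gpus = seconds_to_embed(1_000_000, num_gpus=4)  # 312.5 seconds
```

Under these assumptions, a million chunks would take roughly 21 minutes on one GPU. Real throughput depends on the model, the input length and the hardware.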


Compared to other cloud-based embedding services, DataStax claims an 80 percent price reduction for customers. According to Nvidia’s VP of AI Software Kari Briski, this fits with the desire to connect much more unstructured data to GenAI applications. “Using the integration of Nvidia NIM and NeMo Retriever microservices with the DataStax Astra DB, businesses can significantly reduce latency and harness the full power of AI-driven data solutions.”
