
Generative Artificial Intelligence (gen-AI) has had a busy year, obviously. The technology wires have been liberally peppered with analysis detailing the spiralling popularisation of gen-AI engines and the Large Language Models (LLMs) that feed and serve them. There has been equal interest and discussion related to the wider proliferation of vector databases, with their ability to store ‘embeddings’ of an organization’s own data and so tune AI to a particular use case. But the AI toolkit goes deeper and Retrieval Augmented Generation (RAG) is now surfacing as a fundamental function in the gen-AI space too. So what is it, why does it matter… and who is doing it?

What is Retrieval Augmented Generation (RAG)?

Retrieval Augmented Generation (RAG) is an advanced AI technique that combines information retrieval with text generation, allowing AI models to retrieve relevant information from a knowledge source and incorporate it into generated text. 

In other words, RAG is a hybrid framework that integrates retrieval models and generative models to produce text that is not only contextually accurate but also information-rich. In the simplest terms, RAG is gen-AI that can use an LLM as its knowledge base while also querying some other ‘knowledge store’ (which could be the Internet or some other collection of data or documents) to ground itself in a wider and more established base of information.
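To make that concrete, here is a minimal sketch of the retrieve-then-generate loop in Python. The `embed` and `generate` functions are hypothetical stand-ins for whatever embedding model and LLM an application actually uses; only the retrieval arithmetic is spelled out.

```python
# A minimal RAG loop: embed the query, find the closest stored chunks,
# then ground the LLM's prompt in what was retrieved.
# embed() and generate() are hypothetical stand-ins for a real
# embedding model and LLM API.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, store, k=3):
    # store is a list of (chunk_text, chunk_vector) pairs.
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def answer(question, store, embed, generate):
    context = "\n".join(retrieve(embed(question), store))
    prompt = f"Use only this context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

The generative model never answers from its weights alone; every question rides on top of the retrieved context.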

Now positioning itself as a specialist in powering generative AI applications with real-time scalable data, DataStax has come forward with its own RAGStack service. Described as an out-of-the-box RAG solution designed to simplify the implementation of applications built with LangChain (see below), RAGStack aims to reduce the complexity and the number of choices that developers face when implementing RAG in their generative AI applications. The company says it is a streamlined, tested and efficient set of tools and techniques for building with LLMs.

For completeness here, let’s remind ourselves that open source LangChain is a framework for software engineers building LLM-driven applications with Natural Language Processing (NLP) abilities. LangChain’s objective is to link LLMs like OpenAI’s GPT offerings with external data sources.
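As a flavour of what that linking looks like in practice, the sketch below composes a prompt and a chat model in LangChain’s expression-language style. The import paths follow the 0.0.x-era package layout and may differ in newer releases, so treat them as an assumption rather than gospel.

```python
# A hedged sketch of LangChain's core idea: compose a prompt and an LLM
# into one callable chain. Import paths assume a 0.0.x-era LangChain.
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser

prompt = ChatPromptTemplate.from_template("Explain {topic} in one paragraph.")
llm = ChatOpenAI(model="gpt-3.5-turbo")  # any supported chat model would do

chain = prompt | llm | StrOutputParser()  # prompt -> model -> plain string
print(chain.invoke({"topic": "retrieval augmented generation"}))
```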

The RAG challenge 

As organizations now work to implement RAG, DataStax insists that providing context from outside data sources, in order to deliver more accurate LLM query responses in generative AI applications, can be tough. Why is this so? Because these organizations are left sifting through complex and overwhelming technology choices across open source orchestration frameworks, vector databases, LLMs and more. Currently, companies often need to fork and modify these open source projects for their needs.

Pinpointing what the firm thinks is a hole in the market for an off-the-shelf, commercially supported solution, DataStax explains how its latest offering fills that gap. With RAGStack, developers get a preselected set of the best open source software for implementing generative AI applications: a ready-made solution for RAG that leverages the LangChain ecosystem (including LangServe, LangChain Templates and LangSmith) along with Apache Cassandra and the DataStax Astra DB vector database.
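DataStax has not published RAGStack’s exact wiring here, but the general shape of a LangChain RAG pipeline over a vector store looks something like the sketch below. FAISS stands in for the Astra DB vector database named above, and the import paths again assume a 0.0.x-era LangChain, so this is an illustrative shape rather than RAGStack itself.

```python
# Illustrative only: a LangChain RAG chain over a vector store.
# FAISS is a stand-in for Astra DB; RAGStack's real components may differ.
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.vectorstores import FAISS

# Index a few documents as vectors, then expose them as a retriever.
store = FAISS.from_texts(
    ["RAGStack pairs the LangChain ecosystem with the Astra DB vector database."],
    OpenAIEmbeddings(),
)
retriever = store.as_retriever(search_kwargs={"k": 2})

def format_docs(docs):
    # Flatten retrieved Document objects into one context string.
    return "\n\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-3.5-turbo")
    | StrOutputParser()
)
print(chain.invoke("What does RAGStack bundle together?"))
```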

“Every company building with generative AI right now is looking for answers about the most effective way to implement RAG within their applications,” said Harrison Chase, CEO, LangChain. “DataStax has recognized a pain point in the market and is working to remedy that problem with the release of RAGStack. Using top-choice technologies, like LangChain and Astra DB among others, DataStax is providing developers with a tested, reliable solution made to simplify working with LLMs.”

Out-of-the-box RAG

RAG combines the strengths of both retrieval-based and generative AI methods for Natural Language Understanding (NLU) and generation, enabling real-time, contextually relevant responses that underpin much of the innovation happening with this technology today.

“Out-of-the-box RAG solutions are in high demand because implementing RAG can be complex and overwhelming due to the multitude of choices in orchestration frameworks, vector databases, and LLMs,” said Davor Bonaci, CTO and executive vice president, DataStax. “It’s a crowded arena with few trusted, field-proven options, where demand is high, but supply is relatively low. RAGStack helps to solve this problem and marks a significant step forward in our commitment to providing advanced, user-friendly AI solutions to our customers.”

Also at work here is a technique known as data chunking. Before a retrieval model can search through the data, the data is typically divided into manageable ‘chunks’ or segments.

Data chunking

DataStax explains this chunking process and says that it ensures that the system can efficiently scan through the data and enables quick retrieval of relevant content. “Effective chunking strategies can drastically improve the model’s speed and accuracy: a document may be its own chunk, but it could also be split up into chapters/sections, paragraphs, sentences, or even just ‘chunks of words’. Remember: the goal is to be able to feed the Generative Model with information that will enhance its generation,” notes DataStax, in a technical blog.
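A minimal word-window chunker makes the idea concrete; the chunk size and overlap below are arbitrary illustrative values, not RAGStack defaults.

```python
# Split a document into overlapping word-window chunks, as described above.
# chunk_size and overlap are illustrative values, not RAGStack defaults.
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    words = text.split()
    step = chunk_size - overlap  # slide the window, re-using `overlap` words
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]
```

Splitting on chapters, sections, paragraphs or sentences instead simply changes how the text is cut before it reaches the vector store.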

With specifically curated software components, abstractions said to improve developer productivity and system performance, enhancements to existing vector search techniques and compatibility with most generative AI data components, DataStax is upbeat about its new technology. It promises that RAGStack improves the performance, scalability and cost of implementing RAG in gen-AI applications.

A generative AI toolbox

What’s happening here is a tooling-up process. The generative AI toolbox is not just an LLM, not just a vector database, not just a collection of embeddings, not just a data chunking process, not just NLP know-how and not just privately-skewed data engineering to create some notion of so-called ‘private AI’ aligned to a company’s particular information needs and locked down with additional guardrails and security. It is all those things… and it’s the next thing too, which will be along soon for sure.
