
Cloudera jumps into Bed(rock) with AWS

AI is developing. Not only are we developing Artificial Intelligence (AI), and the Machine Learning (ML) that drives it, in new ways to create new AI ‘models’ and ‘engines’ that help us work and live better; we are also developing a new AI vocabulary, with terms like foundation model, abductive reasoning, vector database, admissible heuristic, backpropagation and of course 2023’s favourite term of all – generative AI. Does this new nomenclature point to an increasingly complex world of AI deployments?

The Collins English Dictionary recently named ‘AI’ the most notable word of 2023 – and most of us would have been surprised if it hadn’t. By most accounts, AI doesn’t appear to be the kind of passing fad that lexicographers sometimes pick as word of the year; 2021’s most notable word, for example, was NFT, or non-fungible token, from the world of blockchain. While AI may not be a ‘permacrisis’ (an extended period of instability and insecurity – and the most notable word in 2022), many organizations are finding it difficult to build AI capabilities given the challenge of sourcing readily available talent and resources. Given this challenge, what help is on hand?

Amazon Bedrock

To help organizations build and scale generative AI applications, AWS recently announced the general availability of Amazon Bedrock. This is a fully managed service that offers a choice of foundation models, from the likes of AI21 Labs, Anthropic, Cohere, Meta, Stability AI, plus of course Amazon, all of which can be accessed via a single Application Programming Interface (API). Amazon claims Bedrock provides the broad set of capabilities organizations need to simplify development while maintaining privacy and security.
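To make the ‘single API’ point concrete, here is a minimal Python sketch, not taken from AWS or Cloudera material. It assumes the boto3 `bedrock-runtime` client and its `invoke_model` call; the two model IDs and the per-provider request and response shapes are assumptions based on the formats documented around Bedrock’s launch and should be checked against current documentation. The helper names (`build_body`, `extract_text`, `generate`) are hypothetical.

```python
import json

# Assumed model IDs; verify against the current Bedrock model list.
CLAUDE = "anthropic.claude-v2"
TITAN = "amazon.titan-text-express-v1"


def build_body(model_id: str, prompt: str, max_tokens: int = 256) -> str:
    """Return the JSON request body in the shape the provider expects."""
    if model_id.startswith("anthropic."):
        payload = {
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
        }
    elif model_id.startswith("amazon.titan-text"):
        payload = {
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens},
        }
    else:
        raise ValueError(f"no request template for {model_id}")
    return json.dumps(payload)


def extract_text(model_id: str, response_body: dict) -> str:
    """Pull the generated text out of a provider-specific response."""
    if model_id.startswith("anthropic."):
        return response_body["completion"]
    if model_id.startswith("amazon.titan-text"):
        return response_body["results"][0]["outputText"]
    raise ValueError(f"no response parser for {model_id}")


def generate(client, model_id: str, prompt: str) -> str:
    """Call one API operation; only the model ID and body format change."""
    response = client.invoke_model(
        modelId=model_id,
        body=build_body(model_id, prompt),
        contentType="application/json",
        accept="application/json",
    )
    return extract_text(model_id, json.loads(response["body"].read()))
```

With AWS credentials and model access in place, a client could be created with `boto3.client("bedrock-runtime", region_name="us-east-1")` (the region is an assumption) and passed to `generate` with either model ID. The client is passed in as a parameter so the same code can be exercised with a stub.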

As AWS reminds us, “Foundation models are a form of generative artificial intelligence (generative AI). They generate output from one or more inputs (prompts) in the form of human language instructions. Models are based on complex neural networks including generative adversarial networks (GANs), transformers and variational encoders.”

Features of Amazon Bedrock include a text playground, a ‘hands-on’ text generation application in the AWS Management Console; an image playground, which takes the form of an image generation application in the console; a chat playground for conversation generation; an examples library of use cases; the Amazon Bedrock API, which cloud engineers can explore with the AWS CLI or call directly to access the base models; embeddings, so that developers can use the API to generate embeddings from the Titan Embeddings G1 – Text model; and options for provisioned throughput to run inference on models.

For additional context here, Amazon Titan Embeddings is a text embeddings model that converts natural language text, including single words, phrases or large documents, into numerical representations that can be used to power use cases such as search, personalisation and clustering based on semantic similarity.
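As a rough illustration of what ‘semantic similarity’ means in code, the sketch below requests an embedding and compares vectors by cosine similarity. It is an assumption-laden example rather than AWS’s own sample: the model ID `amazon.titan-embed-text-v1`, the `inputText` request field and the `embedding` response field are assumed from the Bedrock documentation, and the function names are hypothetical.

```python
import json
import math

# Assumed ID for the Titan Embeddings G1 - Text model.
EMBED_MODEL_ID = "amazon.titan-embed-text-v1"


def embed(client, text: str) -> list:
    """Ask Bedrock for the embedding vector of a piece of text."""
    response = client.invoke_model(
        modelId=EMBED_MODEL_ID,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["embedding"]


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs: dict) -> list:
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A search feature would embed each document once, store the vectors, then embed the incoming query and call `rank_by_similarity`; at scale, a vector database would usually replace the in-memory dictionary.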

Jumping into bed

Data management specialist Cloudera is one of the first companies to jump into bed (pun intended, apologies) with Amazon Bedrock and build generative AI applications using these new features. Cloudera is releasing a Text Summarization Applied ML Prototype (or Text Summarization AMP for short) built on Amazon Bedrock, which will enable customers to use different foundation models. It’s all accessible via a single API to summarise data managed in both Cloudera Public Cloud on AWS (obviously) and Cloudera Private Cloud on-premises.

What does this mean in practice? 

According to Cloudera, it will enable organizations to distil lengthy documents and articles into concise and coherent summaries, facilitating quick decision-making and enhanced productivity (core AI requirement checked off there then!) and so, in practical terms, it will streamline these organizations’ data analysis processes.
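Cloudera has not described how the AMP works internally, so the following is only a generic sketch of one common way to summarise documents too long for a single prompt: split the text into chunks, summarise each chunk, then summarise the summaries. The function names, the prompts and the 4,000-character limit are all illustrative assumptions, and `call_model` stands in for any function that sends a prompt to a foundation model and returns its text.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 4000) -> List[str]:
    """Group paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars becomes its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarise(text: str, call_model: Callable[[str], str],
              max_chars: int = 4000) -> str:
    """Summarise each chunk, then merge the partial summaries."""
    chunks = split_into_chunks(text, max_chars)
    if not chunks:
        return ""
    partials = [
        call_model(f"Summarise the following text in three sentences:\n\n{chunk}")
        for chunk in chunks
    ]
    if len(partials) == 1:
        return partials[0]
    combined = "\n".join(partials)
    return call_model(
        f"Combine these partial summaries into one coherent summary:\n\n{combined}"
    )
```

Passing the model call in as a function keeps the chunking logic independent of whichever foundation model is chosen, which mirrors the article’s point that customers can switch models behind one interface.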

Data practitioner productivity

In the words of Cloudera and Amazon, “We couldn’t be more excited about building generative AI capabilities into Cloudera Data Platform (CDP) to power data practitioner productivity.” Certainly, once the ‘excitement’ has died down, there are some interesting developments on the horizon.

Cloudera is already developing a SQL code AI assistant powered by Amazon Bedrock, which promises to enable data analysts to generate and edit SQL queries using natural language statements. It also aims to optimise SQL queries to make them run more efficiently; explain what a SQL query is doing in plain English; and automatically find and fix errors in queries that won’t run. Cloudera claims this single tool will revolutionise how analysts get work done, allowing them to spend more time on creating so-called ‘business value’ and less time on writing code.
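How the assistant finds and fixes broken queries has not been published, but the general idea can be sketched: ask the database to compile the query, capture any error message, and hand the schema, query and error to a model as a repair prompt. The example below is an illustrative assumption throughout; it uses Python’s built-in `sqlite3` module as a stand-in for the real query engine, and `find_error` and `build_repair_prompt` are hypothetical names.

```python
import sqlite3
from typing import Optional


def find_error(connection: sqlite3.Connection, query: str) -> Optional[str]:
    """Return the database's error message for query, or None if it compiles."""
    try:
        # EXPLAIN makes SQLite compile the statement without running it.
        connection.execute(f"EXPLAIN {query}")
    except sqlite3.Error as exc:
        return str(exc)
    return None


def build_repair_prompt(schema: str, query: str, error: str) -> str:
    """Assemble the context a model would need to propose a fix."""
    return (
        "The SQL query below fails. Return a corrected query only.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Query:\n{query}\n\n"
        f"Error:\n{error}"
    )


# Demo with a misspelled column name (table and query are made up).
schema = "CREATE TABLE orders (id INTEGER, total REAL)"
conn = sqlite3.connect(":memory:")
conn.execute(schema)

broken_query = "SELECT totl FROM orders"
error = find_error(conn, broken_query)
if error is not None:
    prompt = build_repair_prompt(schema, broken_query, error)
    print(prompt)  # this text would be sent to a foundation model
```

A fuller loop would run `find_error` again on the model’s answer before showing it to the analyst, so that a suggested fix is only accepted once the database agrees the query compiles.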
