Better, faster, more affordable: Google Cloud significantly expands Vertex AI portfolio

Google Cloud is releasing extensive updates to its Vertex AI portfolio, aimed at improving existing models and introducing new capabilities. The most notable update is the general availability of Gemini 1.5 Pro, an advanced LLM with a context window of 2 million tokens. In addition, text-to-image generator Imagen 3 becomes available in preview.

Gemini 1.5 Pro has a context window of 2 million tokens, making it suitable for processing large amounts of data in a single pass, whether text, images or video. Think huge code libraries, long videos or the entire contents of a decent research library's bookshelves.

The lighter variant, Gemini 1.5 Flash, is also becoming generally available. Promising low latency at an attractive cost, this model offers a context window of 1 million tokens and, according to Google, is especially suited to practical tasks such as chatbots in retail environments, document analysis, and data extraction from modest-sized repositories.

According to Google, Gemini 1.5 Flash has a context window 60 times larger than that of its competitor GPT-3.5 Turbo, a comparable model. It is also said to be 40 per cent faster than its OpenAI rival when given an input of 10,000 characters, and four times cheaper on input costs.
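The "60 times" figure can be sanity-checked with simple arithmetic. The sketch below assumes GPT-3.5 Turbo's context window is 16,385 tokens, the limit OpenAI lists for that model; that number does not come from Google's announcement, and the function name is purely illustrative.

```python
# Context-window sizes in tokens.
GEMINI_15_FLASH_TOKENS = 1_000_000  # stated in Google's announcement
GPT_35_TURBO_TOKENS = 16_385        # assumption: OpenAI's listed limit for GPT-3.5 Turbo


def window_ratio(larger: int, smaller: int) -> float:
    """Return how many times the larger context window exceeds the smaller one."""
    return larger / smaller


ratio = window_ratio(GEMINI_15_FLASH_TOKENS, GPT_35_TURBO_TOKENS)
print(f"Gemini 1.5 Flash window is about {ratio:.0f}x larger")
```

Under that assumption the ratio comes out at roughly 61, which is in line with Google's rounded claim.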

More effective grounding and third-party input

Grounding based on company-supplied context was already possible, but can now also be done with specialized data from well-known providers, such as credit rating agency Moody's, index provider MSCI, Thomson Reuters, and ZoomInfo.

Those who instead want to ground Gemini in general, current information can use the integration with Google Search. This feeds specific, up-to-date information to a model based on results from the company's famous search engine. In addition, the new High-Fidelity grounding mode is intended to increase the accuracy of grounded answers. This last addition is still in experimental preview.

Google Cloud wants to give Vertex AI users freedom of choice. Those who prefer Anthropic's Claude 3.5 Sonnet to Google's own models but want to stay within the Vertex AI environment can now do so, as the model is available on the platform. Later this summer, the Mistral AI models Mistral Small, Mistral Large and Mistral Codestral will come to Vertex AI.

Little brother grows big

Another major upgrade involves Gemma 2, Gemini's open-source little brother aimed at developers. The new release will soon become available in sizes of 9 billion (9B) and 27 billion (27B) parameters. Google said the larger scale means the model can handle more complex tasks and provide more detailed and more accurate results.

Google is also releasing Context Caching for Gemini 1.5 Pro and Flash, which allows frequently used context to be cached rather than resubmitted with every request. This should reduce input costs and deliver faster answers for workloads that reuse large contexts. In short, it handles larger amounts of data more efficiently and saves money.
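To illustrate why caching a reused context lowers input costs, here is a minimal sketch of the arithmetic. All numbers are hypothetical: the per-token price, the 75 per cent discount on cached tokens and the request volume are assumptions chosen for illustration, not Google's published pricing. The model also ignores any cache storage fees and the cost of writing the cache in the first place.

```python
def input_cost(requests: int, context_tokens: int, query_tokens: int,
               price_per_token: float, cached_discount: float = 0.0) -> float:
    """Estimate total input cost for repeated requests that share one large context.

    cached_discount is the fraction taken off the price of the shared context
    tokens (0.0 means no caching). Hypothetical model: storage fees and the
    initial cache write are not included.
    """
    context_price = price_per_token * (1.0 - cached_discount)
    per_request = context_tokens * context_price + query_tokens * price_per_token
    return requests * per_request


# Hypothetical workload: 100 questions asked against the same 100,000-token document.
PRICE = 1e-6  # assumed price per input token, for illustration only
uncached = input_cost(100, 100_000, 500, PRICE)
cached = input_cost(100, 100_000, 500, PRICE, cached_discount=0.75)
print(f"without caching: {uncached:.2f}, with caching: {cached:.2f}")
```

The larger the shared context is relative to each individual query, the closer the overall saving gets to the discount applied to cached tokens.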

Yet another way to keep the ever-increasing costs of AI workloads manageable is Provisioned Throughput. It lets customers open or close the workload tap at will, depending on need. This is primarily a cost-control measure, and within the Google environment it is available only for the company's first-party models.

Advanced image generator

Definitely worth mentioning in Google's deluge of announcements: image generator Imagen 3 will become available in preview. We've written about this feature before. Imagen 3 is a text-to-image model that generates photorealistic images based on detailed prompts. It can also generate images that look like sketches, cartoons, pixel art or clay animation.

Google specifically mentions the possibility of using Imagen 3 to render text within images, such as inscriptions on buildings or prints on clothing. This is where AI image tools used to go wrong, as they often produced garbled or illegible text.
