Google Cloud is releasing extensive updates to its Vertex AI portfolio, aimed at improving existing models and introducing new capabilities. The most notable update is the general availability of Gemini 1.5 Pro, an advanced LLM with a context window of 2 million tokens. In addition, text-to-image generator Imagen 3 becomes available in preview.
That 2-million-token context window makes Gemini 1.5 Pro suitable for processing large amounts of data in one go, whether text, images or video. Think of huge code libraries, long videos or the entire contents of a decent research library.
The simpler variant, Gemini 1.5 Flash, is also becoming generally available. Promising low latency at an attractive cost, this model offers a context window of 1 million tokens and, according to Google, is especially suited to practical tasks such as retail chatbots, document analysis and data extraction from smaller repositories.
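For the curious, a minimal sketch of calling the newly GA models through the Vertex AI Python SDK could look like this; the project ID, region and prompt are placeholders, and exact model IDs may carry version suffixes such as -001:

```python
# A minimal sketch of calling the newly GA models through the Vertex AI
# Python SDK (pip install google-cloud-aiplatform); project ID, region
# and prompt are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

# Swap in "gemini-1.5-flash" where low latency and cost matter more
# than the full 2-million-token window of "gemini-1.5-pro".
model = GenerativeModel("gemini-1.5-pro")

response = model.generate_content("Summarise this codebase: ...")
print(response.text)
```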
According to Google, Gemini 1.5 Flash has a context window 60 times larger than that of GPT-3.5 Turbo, a comparable OpenAI model. It is also said to be 40 per cent faster than its rival at an input of 10,000 characters, and four times cheaper on input pricing.
More effective grounding and third-party input
Grounding based on company-supplied context was already possible, but it can now also draw on specific information from well-known knowledge providers, such as credit rating agency Moody’s, index provider MSCI, Thomson Reuters and ZoomInfo.
On the other hand, those who want to ground Gemini in general, current information can use the integration with Google Search, which feeds the model specific, up-to-date information drawn from results of the company’s famous search engine. Moreover, the new High-Fidelity grounding mode lets users increase the accuracy of such operations; this last addition is still in experimental preview.
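As an illustration, grounding a Gemini call in Google Search results via the Vertex AI Python SDK looks roughly like the sketch below; the High-Fidelity mode itself sits behind the experimental preview and is not shown:

```python
# Sketch of grounding a Gemini response in Google Search results; the
# project ID and region are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="your-project-id", location="us-central1")

# Attach Google Search as a retrieval tool so answers draw on fresh results.
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "What did Google Cloud announce for Vertex AI this week?",
    tools=[search_tool],
)
print(response.text)
```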
Google Cloud wants to give Vertex AI users freedom of choice. Those who prefer Anthropic’s Claude 3.5 Sonnet to Google’s own models but still want to stay in the Vertex AI environment can now do so, as the model has been made available on the platform. Later this summer, the Mistral AI models Mistral Small, Mistral Large and Codestral will also come to Vertex AI.
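Anthropic’s own SDK ships a Vertex-flavoured client, so a sketch of reaching Claude 3.5 Sonnet from within the Google environment might look as follows; the region, project ID and versioned model identifier are assumptions to adapt to your own setup:

```python
# Sketch of calling Claude 3.5 Sonnet on Vertex AI through Anthropic's
# SDK (pip install 'anthropic[vertex]'); the region, project ID and
# versioned model identifier are assumptions to adapt to your setup.
from anthropic import AnthropicVertex

client = AnthropicVertex(region="us-east5", project_id="your-project-id")

message = client.messages.create(
    model="claude-3-5-sonnet@20240620",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarise this week's Vertex AI news."}],
)
print(message.content[0].text)
```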
Little brother grows big
Another major upgrade involves Gemma 2, Gemini’s open-source little brother aimed at developers. The new generation will soon be available with 9 billion (9B) and 27 billion (27B) parameters. According to Google, this scalability means the model can handle more complex tasks and deliver more detailed and accurate results.
Google is also releasing Context Caching for Gemini 1.5 Pro and Flash, which allows frequently used contexts to be cached once and reused. This should reduce input costs and speed up answers for high-volume contexts; in short, it handles larger amounts of data more efficiently while saving money.
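Based on the preview documentation, using Context Caching from the Python SDK could look roughly like this; the module path and TTL handling may still change while the feature matures:

```python
# Sketch of Context Caching via the preview SDK surface; module paths
# (vertexai.preview.caching) may shift as the feature matures.
import datetime

import vertexai
from vertexai.preview import caching
from vertexai.preview.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

# Cache a large, frequently reused context once...
cached = caching.CachedContent.create(
    model_name="gemini-1.5-pro-001",  # caching expects a versioned model name
    contents=["<several hundred pages of product documentation>"],
    ttl=datetime.timedelta(hours=1),
)

# ...then answer many questions against it without resending those tokens.
model = GenerativeModel.from_cached_content(cached_content=cached)
response = model.generate_content("Which endpoints were deprecated in v2?")
print(response.text)
```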
Tip: Google hints at breakthrough for LLMs with Infini-attention
Yet another way to keep the ever-increasing costs of AI workloads manageable is Provisioned Throughput, which lets customers open or close the workload tap at will, reserving capacity as demand requires. It is primarily a cost-control measure and is available within the Google environment only for the company’s first-party models.
Advanced image generator
Definitely worth mentioning in Google’s deluge of announcements: image generator Imagen 3 is becoming available in preview. We’ve written about this feature before. Imagen 3 is a text-to-image model that generates photorealistic images from detailed prompts, and it can also produce images in styles such as sketches, cartoons, pixel art or clay animation.
Google specifically mentions the possibility of using Imagen 3 to put text in images, such as inscriptions on buildings or prints on clothing. Rendering text is an area where AI image tools have traditionally gone wrong, as they often did not know what to do with such lettering.
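A tentative sketch of generating such an image in the preview, assuming the Vertex AI SDK’s vision surface and a model identifier along the lines of Google’s usual naming scheme:

```python
# Sketch of generating an image with Imagen 3 in preview; the model
# identifier "imagen-3.0-generate-001" is an assumption based on
# Google's naming scheme and may differ at launch.
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="your-project-id", location="us-central1")

model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")

result = model.generate_images(
    prompt='A photorealistic storefront with the words "OPEN DAILY" painted on the window',
    number_of_images=1,
)
result.images[0].save("storefront.png")
```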
Read also: Google Cloud aims to quickly deliver on AI promises