Last week, Google DeepMind made the Interactions API available as a public beta. The new API represents a fundamental change in how developers work with AI models: from a stateless to a stateful architecture with server-side context management. With this move, Google follows the path that OpenAI embarked on in March 2025 with its Responses API.
Over the past two years, developers have worked with generative AI through a so-called ‘completion’ model: you send a prompt, you get an answer, and that is the end of the transaction. For follow-up questions, the entire conversation history has to be sent along each time, which is why the approach is called stateless. This architecture does not work well for complex AI agents that need to use various tools, keep track of extensive context, and engage in extended reasoning to arrive at the best solution.
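To make the contrast concrete, here is a minimal sketch of the stateless pattern using the google-genai Python SDK (the model name and prompts are illustrative): every turn resends the full history, and all bookkeeping is the caller’s responsibility.

```python
# Stateless 'completion' pattern: the client owns the conversation state and
# resends the entire history with every request.
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

history = []  # all bookkeeping lives on the client side

def ask(prompt: str) -> str:
    history.append({"role": "user", "parts": [{"text": prompt}]})
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=history,  # every previous turn travels over the wire again
    )
    history.append({"role": "model", "parts": [{"text": response.text}]})
    return response.text

ask("What does server-side state mean?")
ask("Why does it matter for agents?")  # this request carries both turns
```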
The new Interactions API aims to solve this by supporting server-side state. Developers no longer need to manage and resend the entire conversation history. Instead, they send a so-called previous_interaction_id, which Google uses to retrieve the conversation history, including earlier tool results and model outputs, from its servers.
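What a stateful call could look like is sketched below. This is illustrative only: the previous_interaction_id field is named in Google’s announcement, but the endpoint URL, payload shape, and model identifier are assumptions, not the documented API.

```python
# Stateful pattern: send only the new turn plus a pointer to the previous one.
# Hypothetical sketch: endpoint and payload fields other than
# previous_interaction_id are assumed, not taken from Google's documentation.
import os
import requests

API = "https://generativelanguage.googleapis.com/v1beta/interactions"  # assumed
HEADERS = {"x-goog-api-key": os.environ["GEMINI_API_KEY"]}

first = requests.post(API, headers=HEADERS, json={
    "model": "gemini-3-pro-preview",  # illustrative model id
    "input": "Summarize this 40-page report.",
}).json()

# The follow-up carries no history; the server rehydrates it from the id,
# including earlier tool results and model outputs.
follow_up = requests.post(API, headers=HEADERS, json={
    "model": "gemini-3-pro-preview",
    "input": "Now extract the action items.",
    "previous_interaction_id": first["id"],
}).json()
```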
Background execution for long-running tasks
Another major change is background execution. As soon as you work with complex workflows that take more than a few minutes to complete, timeout errors become a risk. A standard web request is typically limited to somewhere between 60 and 600 seconds, depending on the web server configuration. A process or agent that has to search through many web pages or analyze reports will quickly run into HTTP timeouts.
The Interactions API lets developers start an agent with background=true. The connection is closed immediately, and the result can be retrieved later.
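In practice, that could look like the polling sketch below. Again, only the background flag itself comes from the announcement; the endpoint, status field, and status values are hypothetical.

```python
# Background execution: kick off a long-running agent, disconnect, poll later.
# Hypothetical sketch: only the background flag comes from the announcement;
# the endpoint, status field, and status values are assumed.
import os
import time
import requests

API = "https://generativelanguage.googleapis.com/v1beta/interactions"  # assumed
HEADERS = {"x-goog-api-key": os.environ["GEMINI_API_KEY"]}

start = requests.post(API, headers=HEADERS, json={
    "model": "gemini-3-pro-preview",
    "input": "Compare the security reports of 30 vendors.",
    "background": True,  # return immediately instead of holding the connection
}).json()

# Poll until the task leaves the in-progress state (status values assumed).
interaction = requests.get(f"{API}/{start['id']}", headers=HEADERS).json()
while interaction.get("status") == "in_progress":
    time.sleep(10)
    interaction = requests.get(f"{API}/{start['id']}", headers=HEADERS).json()

print(interaction)
```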
“Models are becoming systems and over time, might even become agents themselves,” wrote DeepMind’s Ali Çevik and Philipp Schmid in the official blog post. “Trying to force these capabilities into generateContent would have resulted in an overly complex and fragile API.”
Google versus OpenAI: transparency or efficiency?
Google is choosing a similar path to OpenAI, but with its own twist. Both companies are moving away from stateless designs to make context more readily available, yet the routes they have chosen differ considerably.
OpenAI’s Responses API introduced Compaction, a feature that compresses the conversation history: it keeps only the final outputs and discards intermediate tool outputs and reasoning chains. This improves token efficiency but creates a black box that hides the model’s previous reasoning.
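For comparison, this is what OpenAI’s stateful chaining looks like in the official openai Python SDK, using the previous_response_id that Compaction builds on (a minimal sketch; the model name and prompts are illustrative):

```python
# For comparison: OpenAI's server-side state via previous_response_id.
# Minimal sketch using the openai Python SDK; model name is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

first = client.responses.create(
    model="gpt-4.1",
    input="Summarize this 40-page report.",
)

# The follow-up references the stored response instead of resending history.
follow_up = client.responses.create(
    model="gpt-4.1",
    input="Now extract the action items.",
    previous_response_id=first.id,
)
print(follow_up.output_text)
```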
Google’s Interactions API, on the other hand, keeps the entire history available and composable. The data model allows developers to debug, manipulate, stream, and reason about messages. Google prioritizes transparency and full searchability over compression.
Native MCP support and available models
Google also embraces the open ecosystem by providing native support for the Model Context Protocol (MCP). Gemini models can directly invoke external tools hosted on remote servers without developers writing glue code, which makes it easy to pull in external information and enrich the model’s context.
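A request with a remote MCP server attached could look roughly like the sketch below. The tools payload shape, field names, and URL are assumptions for illustration; only the native MCP support itself comes from the announcement.

```python
# Native MCP support: point the request at a remote MCP server instead of
# writing tool-calling glue code yourself.
# Hypothetical sketch: the tools payload shape and field names are assumed.
import os
import requests

API = "https://generativelanguage.googleapis.com/v1beta/interactions"  # assumed
HEADERS = {"x-goog-api-key": os.environ["GEMINI_API_KEY"]}

resp = requests.post(API, headers=HEADERS, json={
    "model": "gemini-3-pro-preview",
    "input": "Which open tickets are assigned to me?",
    "tools": [{
        "type": "mcp_server",                  # hypothetical field name
        "url": "https://mcp.example.com/sse",  # example remote MCP server
    }],
}).json()
print(resp)
```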
The Interactions API is now available in public beta via Google AI Studio. It supports the full spectrum of Google’s latest generation of models: Gemini 3.0 (Gemini 3 Pro Preview), Gemini 2.5 (Flash, Flash-Lite, and Pro), and the Deep Research Preview agent.
The pricing structure remains the same: the standard per-model rates for input and output tokens apply. What differs is how long the interaction history is retained. The free tier has a retention period of 1 day, while the paid tier keeps interactions available for 55 days.