Google makes breakthrough in efficiency for AI agents

Google researchers, in collaboration with the University of California, Santa Barbara, have developed a new framework that helps AI agents use computing power and tools more efficiently. 

This is according to a recent paper on arXiv, as reported by VentureBeat. The research focuses on a growing problem in agentic AI: how to scale the use of external tools without costs and latency becoming unmanageable.

Test-time scaling for AI agents is increasingly shifting from longer reasoning traces to controlling tool calls. In many practical applications, such as web search and document analysis, the number of external actions determines how deep an agent can dig. Each tool call lengthens the context, increases token consumption, and incurs additional API costs. For companies, these costs can add up quickly.

The researchers note that allocating more budget to an agent does not always lead to better performance. According to the authors, many agents lack any awareness of their available resources. They follow a single lead for too long, spend dozens of tool calls on a seemingly relevant direction, and only discover late in the process that it is a dead end. As a result, extra computing budget is consumed without any significant gain in quality.

As a first step, the researchers introduce Budget Tracker, a simple module that continuously informs the agent about the remaining budget. This approach works entirely at the prompt level and does not require retraining. The agent receives explicit signals about resource usage and can adjust its strategy accordingly. In Google’s implementation, the tracker also includes guidelines that indicate which behavior is appropriate for different budget levels.
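
To make the idea concrete, here is a minimal sketch of what such a prompt-level tracker could look like, assuming a simple counter over tool calls. The class name, the thresholds, and the guidance strings are illustrative assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class BudgetTracker:
    """Illustrative prompt-level budget tracker.

    Counts tool calls against a fixed budget and produces a status
    string that can be injected into the agent's prompt each turn.
    Thresholds and guidance text are hypothetical.
    """
    max_tool_calls: int
    used_tool_calls: int = 0

    def record_call(self) -> None:
        self.used_tool_calls += 1

    def remaining(self) -> int:
        return self.max_tool_calls - self.used_tool_calls

    def status_prompt(self) -> str:
        # Explicit resource signal plus behavioral guidance keyed to
        # the fraction of budget left, per the article's description.
        frac = self.remaining() / self.max_tool_calls
        if frac > 0.5:
            guidance = "Budget is ample: explore several leads."
        elif frac > 0.2:
            guidance = "Budget is limited: focus on the most promising lead."
        else:
            guidance = "Budget is nearly exhausted: verify and finalize an answer."
        return (
            f"[Budget] {self.used_tool_calls}/{self.max_tool_calls} tool calls used "
            f"({self.remaining()} remaining). {guidance}"
        )
```

In use, the status string would simply be appended to the agent's context before each decision, which is what keeps the method entirely at the prompt level, with no retraining involved.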

Experiments with search agents that follow a ReAct-style loop show that this approach is effective. The paper reports that Budget Tracker can reduce the number of search calls by more than 40 percent and browse calls by almost 20 percent, while total costs fall by more than 30 percent. At the same time, performance keeps improving at higher budgets, where conventional agents tend to plateau.

Verification and budget awareness in a single iterative process

In addition to this lightweight solution, the paper describes a more comprehensive framework called Budget Aware Test-time Scaling, or BATS. BATS combines planning, verification, and budget awareness in a single iterative process. The agent dynamically adjusts its behavior based on the remaining budget and decides whether to continue its investigation or change course.
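
The article does not spell out the loop's mechanics, but a plausible reading is a plan-act-verify cycle conditioned on the tracker's status. The sketch below is an illustration under that assumption; the `agent` and `verifier` interfaces, the pivot heuristic, and all method names are hypothetical, not BATS's actual API:

```python
def bats_loop(agent, task, tracker, verifier):
    """Sketch of a budget-aware plan/act/verify loop in the spirit of BATS.

    `agent` and `verifier` are hypothetical duck-typed objects; `tracker`
    is the BudgetTracker sketched above. The control flow is inferred
    from the article, not taken from the paper.
    """
    plan = agent.make_plan(task, tracker.status_prompt())
    while tracker.remaining() > 0:
        action = agent.next_action(plan, tracker.status_prompt())
        observation = agent.execute(action)   # e.g. a search or browse call
        tracker.record_call()
        candidate = agent.draft_answer(task, observation)
        if verifier.accept(task, candidate):  # verification gate
            return candidate
        # Budget-conditioned decision: keep digging or change course.
        if agent.lead_looks_dead(observation) or tracker.remaining() < 3:
            plan = agent.replan(task, tracker.status_prompt())
    return agent.best_effort_answer(task)
```

The key design point the article highlights is that verification and the remaining budget feed into the same decision, so the agent can abandon a dead-end lead early instead of burning the rest of its budget on it.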

Tests on benchmarks such as BrowseComp and HLE-Search, with Gemini 2.5 Pro as the underlying model, show that BATS achieves higher accuracy at lower cost than existing methods.