Google launches Gemini 3.1 Pro, an LLM for complex reasoning

Earlier this month, Anthropic unveiled the Opus and Sonnet versions of Claude 4.6, which beat Google’s Gemini 3 Pro on several fronts. A response was not long in coming: the new, more powerful Gemini 3.1 Pro.

Google is rolling out 3.1 Pro starting today. It will appear in the Gemini app, in NotebookLM for paying users, and in preview via the Gemini API. Although Google refers to it as an update to 3 Pro, the improvements are significant, with considerably stronger reasoning capabilities. At an earlier stage of GenAI, a release like this would have been given a completely new version number. This is evident from the ARC-AGI-2 benchmark, on which 3 Pro scores 31.1 percent and 3.1 Pro more than doubles that, at 77.1 percent.
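For readers who want to check the size of that jump, the arithmetic can be sketched in a few lines of Python. The two scores are the ones quoted above; the function name `compare_scores` is purely illustrative.

```python
def compare_scores(old_pct: float, new_pct: float) -> tuple[float, float]:
    """Return (ratio, percentage-point gain) between two benchmark scores."""
    ratio = new_pct / old_pct
    point_gain = new_pct - old_pct
    return ratio, point_gain


# ARC-AGI-2 scores as quoted in the article: 3 Pro vs. 3.1 Pro.
ratio, gain = compare_scores(31.1, 77.1)
print(f"ratio: {ratio:.2f}x, gain: {gain:.1f} percentage points")
```

The ratio comes out at roughly 2.5, a gain of 46 percentage points.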

Almost always better

The release of the Llama 4 model in April last year already showed that benchmarks don’t tell the whole story. At the time, Meta seemed competitive with Google, OpenAI, and Anthropic on paper, but actual users were generally disappointed. We have not yet seen such a contrast between theory and practice from Google. That is why the scores of 3.1 Pro, which are generally slightly better than those of 3 Pro, seem to represent a meaningful upgrade on several fronts. Strikingly, multimodal understanding and reasoning, tested via MMMU Pro, scored lower than before: 80.5 percent compared to 81.0 percent. The answers to this test were kept secret until a week ago. It is the only benchmark on which 3 Pro beats 3.1 Pro, so the result is probably not all that significant.

The challenge from Claude is more consistent: Opus 4.6 codes practically as well as 3.1 Pro. On SWE-Bench Verified for agentic coding, 3.1 Pro scores 80.6 percent and Opus 4.6 scores 80.8 percent. That Claude model also handles tool use better than the latest Gemini model, while expert tasks are best performed by Sonnet 4.6. The message is clear: don’t expect Anthropic to be overly concerned about Google’s latest model, even though Gemini 3.1 Pro is a much better agentic model than its predecessor.

Multimodal and practical

The improved intelligence of 3.1 Pro must translate into practical applications. Google cites examples such as visual explanations of complex topics, merging data into an overview, or bringing creative projects to life. One concrete feature is code-based animation: 3.1 Pro generates animated SVGs directly from a text prompt. These remain clear at any scale and have smaller file sizes than traditional video. Anyone looking at the results can clearly see an improvement (albeit subjective).

Comparison between Gemini 3 Pro and Gemini 3.1 Pro with isometric device illustrations; 3.1 Pro generates SVG-based animations, while 3 Pro shows a static image.
Source: Google
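To make "code-based animation" concrete, here is a minimal, hand-written sketch of an animated SVG, built with Python's standard library. It is not output from Gemini 3.1 Pro and says nothing about how Google's model produces its results. It only shows why such files stay sharp at any scale and remain small: the animation is a few lines of vector markup rather than video frames. The function name `build_animated_svg`, the colour, and the sizes are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_animated_svg(size: int = 120, duration_s: float = 2.0) -> str:
    """Build a tiny SVG in which a circle pulses using a SMIL <animate> element."""
    svg = ET.Element("svg", {
        "xmlns": SVG_NS,
        "viewBox": f"0 0 {size} {size}",  # vector coordinates, so it scales cleanly
        "width": str(size),
        "height": str(size),
    })
    circle = ET.SubElement(svg, "circle", {
        "cx": str(size // 2),
        "cy": str(size // 2),
        "r": "10",
        "fill": "#4285f4",
    })
    # Animate the radius from 10 to 40 and back, repeating forever.
    ET.SubElement(circle, "animate", {
        "attributeName": "r",
        "values": "10;40;10",
        "dur": f"{duration_s}s",
        "repeatCount": "indefinite",
    })
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    markup = build_animated_svg()
    with open("pulse.svg", "w", encoding="utf-8") as fh:
        fh.write(markup)
    print(f"wrote pulse.svg ({len(markup)} characters)")
```

Opening the resulting `pulse.svg` in a browser shows the circle pulsing. The whole file is a few hundred characters of text.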

Since the release of Gemini 3 Pro in November, user feedback has led to these rapid improvements. Google is now rolling out 3.1 Pro in preview to validate updates. The company wants to continue developing in the area of agentic workflows before the model becomes generally available. Normally, the benchmarks in the announcements do not change, but it will be interesting to see if tool usage improves with the actual release.

Rollout and access

Starting today, Gemini 3.1 Pro is available to various audiences. Developers get preview access via the Gemini API in Google AI Studio, Gemini CLI, Google Antigravity, and Android Studio. Businesses can use the model via Vertex AI and Gemini Enterprise. For consumers, it will roll out in the Gemini app and NotebookLM. Users of Google AI Pro and Ultra plans will receive higher limits for 3.1 Pro in the Gemini app. In NotebookLM, the model will be available exclusively to Pro and Ultra users. The broad rollout follows the course Google set at the end of 2025 with the Gemini 3 series. The company sees feedback as an important means of enabling rapid iterations on its AI models.