IBM has released a new generation of open-source language models, Granite 4. The series combines two neural network architectures and is designed to deliver better performance with less memory usage.
At launch, the Granite 4 family consists of four models ranging in size from 3 to 32 billion parameters. According to IBM, they perform more efficiently than previous generations, thanks to a hybrid design that combines the Transformer architecture with Mamba, a new and hardware-efficient network structure.
One of the smaller models, Granite-4.0-Micro, relies solely on the Transformer architecture, which is known for its attention mechanism: it lets the model weigh and prioritize the most relevant parts of a text. The other three models add elements of the Mamba architecture. Mamba offers similar capabilities but is built on a state space model, a mathematical framework with roots in control theory and signal processing.
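As an illustration only, the sketch below contrasts the two mechanisms in a few lines of NumPy: a scaled dot-product attention step that touches every token in the sequence, and a toy state-space recurrence that folds the sequence into a fixed-size hidden state. The shapes and matrices are invented for the example and do not reflect Granite's actual layers.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every token attends to every other token,
    so the whole sequence must be kept around."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def ssm_scan(x, A, B, C):
    """Toy state-space recurrence: the sequence is compressed into a
    fixed-size hidden state h, so memory does not grow with length."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:                 # one scalar input per step
        h = A @ h + B * x_t       # update the hidden state
        ys.append(C @ h)          # read an output from the state
    return np.array(ys)

rng = np.random.default_rng(0)
T, d, n = 8, 4, 3                 # sequence length, model dim, state dim
Q = K = V = rng.normal(size=(T, d))
print(attention(Q, K, V).shape)   # (8, 4)
print(ssm_scan(rng.normal(size=T), rng.normal(size=(n, n)) * 0.1,
               rng.normal(size=n), rng.normal(size=n)).shape)  # (8,)
```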
Lower memory pressure
One advantage of Mamba is its lower memory pressure with long input prompts. A Transformer's memory usage grows with the length of the context, because it keeps a key-value cache entry for every token it has seen, whereas Mamba compresses the sequence into a fixed-size state. This makes the models cheaper and faster to run, which is particularly useful in real-time applications or on lighter hardware.
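The difference is easy to see in a back-of-the-envelope comparison. The sketch below estimates the memory a Transformer needs for its key-value cache against the fixed-size state a Mamba layer keeps; the layer counts, head sizes and state dimensions are illustrative assumptions, not Granite 4 specifications.

```python
# Rough, illustrative numbers only: layer count, heads, head size and dtype
# width are assumptions for the comparison, not Granite 4 specifications.
def kv_cache_bytes(tokens, layers=32, kv_heads=8, head_dim=128, bytes_per_value=2):
    # A Transformer stores a key and a value vector per token, per layer.
    return tokens * layers * kv_heads * head_dim * 2 * bytes_per_value

def mamba_state_bytes(layers=32, channels=4096, state_dim=16, bytes_per_value=2):
    # A Mamba layer keeps a fixed-size state, independent of prompt length.
    return layers * channels * state_dim * bytes_per_value

for tokens in (1_000, 32_000, 128_000):
    print(f"{tokens:>7} tokens: "
          f"KV cache ~{kv_cache_bytes(tokens) / 2**20:8.0f} MiB, "
          f"Mamba state ~{mamba_state_bytes() / 2**20:5.0f} MiB")
```

Under these assumed settings the key-value cache climbs from roughly a hundred megabytes to well over ten gigabytes as the prompt grows, while the state-space side stays constant.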
The Granite 4 series is built on the latest version of the Mamba architecture, Mamba 2, which is more compact and efficient, requiring less hardware for the same calculations. The largest model, Granite-4.0-H-Small, has 32 billion parameters and uses a mixture-of-experts design in which only a fraction of the parameters is activated for each token. IBM positions it as a suitable choice for automated customer support.
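For illustration, the toy sketch below shows the mixture-of-experts idea: a gate scores all experts for a given token, but only the top-scoring ones are actually evaluated, so most of the layer's parameters stay inactive per token. The routing and expert functions are simplified placeholders, not IBM's implementation.

```python
import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    """Minimal mixture-of-experts sketch: score all experts, run only top_k."""
    scores = x @ gate_w                        # one score per expert
    top = np.argsort(scores)[-top_k:]          # indices of the chosen experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # normalize the gate weights
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [lambda x, W=rng.normal(size=(d, d)) / np.sqrt(d): x @ W
           for _ in range(n_experts)]          # each expert is a small linear map
gate_w = rng.normal(size=(d, n_experts))
token = rng.normal(size=d)
print(moe_layer(token, experts, gate_w).shape)  # (16,) — only 2 of 8 experts ran
```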
The two smaller hybrid models, Granite-4.0-H-Tiny and Granite-4.0-H-Micro, have 7 billion and 3 billion parameters, respectively. They are intended for applications where speed is more important than maximum accuracy.
According to IBM, Granite-4.0-H-Tiny consumes much less memory than its predecessor, Granite 3.3 8B. In internal tests, the model used only about one-sixth of the RAM while producing better output. An IBM researcher noted that the efficiency of the new architecture explains only part of the gains; refined training methods and a larger training corpus also contribute significantly to the improved performance.
Granite 4 is available through IBM’s watsonx.ai platform and through external services such as Hugging Face. IBM also plans to offer the models through Amazon SageMaker JumpStart and Microsoft Azure AI at a later date, and intends to expand with new variants featuring more advanced reasoning capabilities.
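For those who want to try the models, a minimal sketch of loading one of them with the Hugging Face transformers library might look like the following. The repository name used here is an assumption and should be checked against the ibm-granite organization on Hugging Face; the hybrid models may also require a recent transformers release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-tiny"   # assumed repository ID, verify before use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Summarize the benefits of a hybrid Mamba/Transformer model.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```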