Meta has detailed four successive generations of its homegrown MTIA AI chip, developed in partnership with Broadcom, as its focus shifts from content-ranking workloads toward GenAI inference.
Meta has unveiled details of four successive generations of its in-house AI chip, the Meta Training and Inference Accelerator (MTIA), developed in partnership with Broadcom. The four generations, MTIA 300, 400, 450, and 500, have been produced in under two years, with several already in production and others scheduled for mass deployment in 2026 and 2027.
The quick pace is deliberate. Rather than betting on a single chip generation and waiting years for results, Meta has adopted a roughly six-month cadence per generation, using a modular chiplet architecture to enable incremental upgrades without replacing entire rack systems. MTIA is not Meta's only silicon play: in February, Meta announced a deal with AMD for 6 gigawatts of Instinct GPUs, and it remains heavily reliant on Nvidia hardware.
From ranking and recommendation to GenAI
MTIA 300 was built for Meta's ranking and recommendation (R&R) workloads and is currently in production for R&R training. As generative AI grew, MTIA 300 evolved into MTIA 400, featuring a 72-accelerator scale-up domain and 400% higher FP8 FLOPS than its predecessor. Meta says MTIA 400 has finished lab testing and is on the path to data center deployment.
MTIA 450 targets GenAI inference specifically. It doubles HBM bandwidth over MTIA 400, exceeding leading commercial products, Meta claims, and delivers 6x higher FLOPS in the low-precision MX4 format than in FP16/BF16. Mass deployment is scheduled for early 2027. MTIA 500 then adds a further 50% HBM bandwidth increase, up to 80% more HBM capacity, and 43% higher MX4 FLOPS over MTIA 450. From MTIA 300 to 500, HBM bandwidth grows 4.5x and compute FLOPS 25x.
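The per-generation bandwidth figures compound into the cumulative claim. A minimal sanity check, using only numbers stated in the article (the MTIA 300-to-400 step is not given directly, so it is derived from the cumulative 4.5x figure):

```python
# Per-generation HBM bandwidth multipliers as stated by Meta.
gain_450_over_400 = 2.0       # "doubling HBM bandwidth over MTIA 400"
gain_500_over_450 = 1.5       # "a further 50% HBM bandwidth increase"
cumulative_300_to_500 = 4.5   # "HBM bandwidth grows 4.5x"

# The 300 -> 400 step is implied by dividing the cumulative gain
# by the two stated later steps.
implied_400_over_300 = cumulative_300_to_500 / (
    gain_450_over_400 * gain_500_over_450
)
print(implied_400_over_300)  # 1.5, i.e. a 50% bandwidth step from 300 to 400
```

So the stated figures are internally consistent only if MTIA 400 carried roughly a 50% bandwidth increase over MTIA 300, a number the article does not state outright.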
Meta first presented MTIA at ISCA 2023 and has since deployed hundreds of thousands of chips in production.
Inference-first, PyTorch-native
Meta’s strategy rests on three pillars: high-velocity development, an inference-first focus, and frictionless adoption. Where mainstream GPUs are built primarily for large-scale pre-training, MTIA 450 and 500 are optimized first for inference. The software stack takes a PyTorch-native approach, integrating with vLLM and Triton so that developers can use torch.compile and torch.export without MTIA-specific rewrites. MTIA 400, 450, and 500 all share the same chassis, rack, and network infrastructure, allowing each new generation to slot into the same physical footprint.
MTIA 450 and 500 are both scheduled for mass deployment in 2027.