During the first-ever LlamaCon, Meta made several announcements and presented new tools aimed at making Llama models more accessible to developers. The headline announcement was the introduction of the Llama API, now available to developers as a limited free preview.
With the Llama API, developers can try out various Llama models, including the recently launched Llama 4 Scout and Llama 4 Maverick. The API offers easy creation of API keys and lightweight TypeScript and Python SDKs. To make the transition from OpenAI-based applications easier, the Llama API is compatible with the OpenAI SDK.
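Because the Llama API exposes an OpenAI-compatible interface, existing OpenAI SDK code should need little more than a new API key and base URL. The sketch below is illustrative only: the base URL and model identifier are assumptions, not confirmed values, and the real ones come from the Llama API documentation.

```python
# Minimal sketch: calling the Llama API through the OpenAI Python SDK.
# The base_url and model name are placeholders (assumptions), not confirmed values.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LLAMA_API_KEY",                  # key created in the Llama API preview
    base_url="https://api.llama.com/compat/v1/",   # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="Llama-4-Scout-17B-16E-Instruct",        # assumed model identifier
    messages=[{"role": "user", "content": "Summarize the LlamaCon announcements."}],
)
print(response.choices[0].message.content)
```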
Significant acceleration thanks to partnerships
Meta is collaborating with Cerebras and Groq to achieve higher inference speeds for the Llama API. Cerebras claims that its Llama 4 inference within the API can generate tokens up to eighteen times faster than traditional GPU-based solutions from NVIDIA and others. According to benchmarks from the Artificial Analysis website, Cerebras' solution achieved over 2,600 tokens per second for Llama 4 Scout, while ChatGPT hovered around 130 tokens per second and DeepSeek reached around 25 tokens per second.
Andrew Feldman, CEO and co-founder of Cerebras, said that Cerebras is proud to make the Llama API the fastest inference API in the world. According to him, developers building real-time and agent-based applications need speed above all else, and with Cerebras they can build AI systems that remain out of reach for traditional GPU-based solutions.
Developers interested in this extremely fast Llama 4 inference can select Cerebras as a model option within the Llama API. Llama 4 Scout is also available through Groq, where it runs at around 460 tokens per second: roughly six times slower than Cerebras, yet still about four times faster than other GPU-based solutions.
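Meta has not detailed how this selection works in the preview. One plausible shape, purely as an assumption, is that the accelerated variants are exposed as distinct model identifiers, so only the model string in the earlier example would change:

```python
# Hypothetical: request the Cerebras-accelerated variant by model name,
# reusing the client from the previous sketch. The identifier below is
# an assumption, not a confirmed value from the Llama API.
response = client.chat.completions.create(
    model="Cerebras-Llama-4-Scout-17B-16E-Instruct",  # assumed provider-tagged name
    messages=[{"role": "user", "content": "Give me a one-line status update."}],
)
```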
Llama Defenders Program
During LlamaCon, Meta also presented new Llama Protection Tools and announced the Llama Defenders Program, which gives selected partners access to AI-driven tools that enable them to evaluate the security of their systems and protect themselves against potential threats.