
Docker Model Runner makes running LLMs locally easier

Docker is introducing Model Runner in beta for macOS on Apple silicon. This allows developers to easily work with Large Language Models (LLMs) locally.

Running LLMs locally is still a challenge for many developers. Between choosing the right model, dealing with hardware limitations and optimizing performance, developers can get stuck before they can even start building. Docker Model Runner aims to change this. The new tool, available by default in Docker Desktop 4.40, promises to make it possible to run AI models without a complex setup via an OpenAI-compatible API and offers GPU acceleration on Apple hardware.

What makes Model Runner special?

The initial beta version of Docker Model Runner offers an integrated inference engine built on top of llama.cpp, accessible via an OpenAI-compatible API. Existing code that works with OpenAI’s API can easily be modified to run locally with Model Runner.
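As a minimal sketch of what that compatibility means in practice: an OpenAI-style chat completion call only needs its base URL swapped to the local Model Runner endpoint. The /engines/v1 path, the TCP port 12434, and the model name ai/smollm2 below are assumptions for illustration; check your Docker Desktop release for the exact values.

```python
import json
import urllib.request

# Assumed local endpoint; requires Model Runner exposed on a TCP port
# (12434 here). The /engines/v1 path is an assumption for this sketch.
BASE_URL = "http://localhost:12434/engines/v1"

def chat_completion_request(prompt: str, model: str = "ai/smollm2") -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for Model Runner."""
    body = json.dumps({
        "model": model,  # hypothetical model name; use one you have pulled
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_completion_request("Say hello")
# urllib.request.urlopen(req) would send it once Model Runner is listening.
```

Because the request shape is the standard OpenAI chat completions format, code written against OpenAI's hosted API should need little more than this base-URL change.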

A significant advantage is the GPU acceleration on Apple silicon. By executing the inference engine directly as a host process, Model Runner can make optimal use of the graphics capabilities of Apple hardware. In addition, the models are packaged as standard OCI artifacts, making them easy to distribute and reuse via existing container registry infrastructure.

Getting started with Model Runner

Docker Model Runner is enabled by default in Docker Desktop 4.40 for macOS on Apple silicon. If you have disabled the feature, you can reactivate it with one simple command: docker desktop enable model-runner.

By default, Model Runner is only accessible via the Docker socket on the host, or via the special endpoint model-runner.docker.internal for containers. If you want to reach it over TCP from a host process, for example to connect an OpenAI SDK directly, you can enable Model Runner on a specific port: docker desktop enable model-runner --tcp 12434.
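The two access paths above can be sketched as a small helper that picks the right base URL depending on where the client runs. The /engines/v1 path is an assumption for illustration and may differ in your Docker Desktop version; the host names come straight from the article.

```python
def model_runner_base_url(inside_container: bool, tcp_port: int = 12434) -> str:
    """Return the Model Runner base URL for this client.

    Containers use the special model-runner.docker.internal endpoint;
    host processes use the TCP port chosen with --tcp when enabling
    Model Runner. The /engines/v1 suffix is an assumption.
    """
    if inside_container:
        host = "model-runner.docker.internal"
    else:
        host = f"localhost:{tcp_port}"
    return f"http://{host}/engines/v1"

print(model_runner_base_url(True))   # from inside a container
print(model_runner_base_url(False))  # from the host, via the TCP port
```

Point an OpenAI-compatible client at whichever URL this returns for your environment and the rest of the calling code stays the same.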

Docker Model Runner is still in beta, but it already promises a simplified way to experiment with and develop AI models locally.
