
Docker Model Runner makes running LLMs locally easier

Docker is introducing Model Runner in beta for macOS on Apple silicon. This allows developers to easily work with Large Language Models (LLMs) locally.

Running LLMs locally is still a challenge for many developers. Between choosing the right model, dealing with hardware limitations and optimizing performance, developers can get stuck before they can even start building. Docker Model Runner aims to change this. The new tool, available by default in Docker Desktop 4.40, promises to make it possible to run AI models without a complex setup via an OpenAI-compatible API and offers GPU acceleration on Apple hardware.

What makes Model Runner special?

The initial beta version of Docker Model Runner offers an integrated inference engine built on top of llama.cpp, accessible via an OpenAI-compatible API. Existing code that works with OpenAI’s API can easily be modified to run locally with Model Runner.
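As a minimal sketch of what that compatibility means in practice: an OpenAI-style chat completion call only needs its base URL swapped to the local Model Runner endpoint. The /engines/v1 path, the TCP port 12434, and the model name ai/smollm2 below are assumptions for illustration; check your Docker Desktop release for the exact values.

```python
import json
import urllib.request

# Assumed local endpoint; requires Model Runner exposed on a TCP port
# (12434 here). The /engines/v1 path is an assumption for this sketch.
BASE_URL = "http://localhost:12434/engines/v1"

def chat_completion_request(prompt: str, model: str = "ai/smollm2") -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for Model Runner."""
    body = json.dumps({
        "model": model,  # hypothetical model name; use one you have pulled
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_completion_request("Say hello")
# urllib.request.urlopen(req) would send it once Model Runner is listening.
```

Because the request shape is the standard OpenAI chat completions format, code written against OpenAI's hosted API should need little more than this base-URL change.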

A significant advantage is the GPU acceleration on Apple silicon. By executing the inference engine directly as a host process, Model Runner can make optimal use of the graphics capabilities of Apple hardware. In addition, the models are packaged as standard OCI artifacts, making them easy to distribute and reuse via existing container registry infrastructure.

Getting started with Model Runner

Docker Model Runner is enabled by default in Docker Desktop 4.40 for macOS on Apple silicon. If you have disabled the feature, you can reactivate it with one simple command: docker desktop enable model-runner.

By default, Model Runner is only accessible via the Docker socket on the host, or via the special endpoint model-runner.docker.internal for containers. If you want to reach it over TCP from a host process, for example to connect an OpenAI SDK directly, you can enable Model Runner on a specific port: docker desktop enable model-runner --tcp 12434.
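The two access paths above can be sketched as a small helper that picks the right base URL depending on where the client runs. The /engines/v1 path is an assumption for illustration and may differ in your Docker Desktop version; the host names come straight from the article.

```python
def model_runner_base_url(inside_container: bool, tcp_port: int = 12434) -> str:
    """Return the Model Runner base URL for this client.

    Containers use the special model-runner.docker.internal endpoint;
    host processes use the TCP port chosen with --tcp when enabling
    Model Runner. The /engines/v1 suffix is an assumption.
    """
    if inside_container:
        host = "model-runner.docker.internal"
    else:
        host = f"localhost:{tcp_port}"
    return f"http://{host}/engines/v1"

print(model_runner_base_url(True))   # from inside a container
print(model_runner_base_url(False))  # from the host, via the TCP port
```

Point an OpenAI-compatible client at whichever URL this returns for your environment and the rest of the calling code stays the same.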

Docker Model Runner is still in beta, but it already promises a simplified way to experiment with and develop AI models locally.
