
Microsoft makes its Phi-4 small language model open-source

Microsoft has open-sourced Phi-4, a small language model that can generate text and solve mathematical problems.

SiliconANGLE reports on the release. Microsoft first described the model last month. Initially, Phi-4 was only accessible through Azure AI Foundry, Microsoft's artificial intelligence development service. The model is now available for download on Hugging Face, a popular platform for open-source AI projects.

14 billion parameters

Phi-4 is the fourth generation in a series of small language models that Microsoft introduced in 2023. It features 14 billion parameters, the learned numerical values that determine how a neural network processes data. Microsoft researchers trained the model for 21 days on a cluster of 1,920 H100 graphics processing units from Nvidia Corp.
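For readers who want to try the model themselves, the following is a minimal sketch of loading the open-sourced weights with the Hugging Face transformers library. The microsoft/phi-4 repository name is assumed here, and this is not an official Microsoft example.

```python
# Minimal sketch: load the open-sourced Phi-4 weights from Hugging Face and
# count parameters. Assumes the checkpoint is published as "microsoft/phi-4"
# and that the transformers and torch packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"  # assumed Hugging Face repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

print(f"parameters: {model.num_parameters():,}")  # should print roughly 14 billion
```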

The model is based on the Transformer architecture, the industry standard on which most major language models are built. When they receive a user prompt, Transformer models split the input into tokens, individual words or word fragments, and determine the meaning of each token by analyzing the surrounding text. An attention mechanism lets them prioritize the parts of the text considered most relevant.
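To make the attention idea concrete, here is a toy NumPy sketch of scaled dot-product attention, the mechanism Transformers use to weight the surrounding context. The dimensions and random projections are illustrative assumptions, not Phi-4's actual configuration.

```python
# Toy scaled dot-product attention: each token scores every other token and
# mixes their representations according to those scores. Illustrative only.
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                       # 4 tokens, 8-dim embeddings (toy sizes)
x = rng.normal(size=(seq_len, d_model))       # stand-in for token embeddings

# In a real model, Q, K and V come from learned projection matrices.
wq, wk, wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
q, k, v = x @ wq, x @ wk, x @ wv

scores = q @ k.T / np.sqrt(d_model)           # pairwise relevance of tokens
weights = softmax(scores)                     # each row sums to 1
output = weights @ v                          # context-weighted representations
print(weights.round(2))
```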

Lower costs

Phi-4 implements a so-called decoder-only variant of the Transformer architecture. A standard encoder-decoder Transformer analyzes the text both before and after a word to determine its meaning, whereas decoder-only models look exclusively at the text preceding the word. This reduces the amount of data to process and thus lowers inference costs.
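Continuing the earlier sketch, a decoder-only model adds a causal mask so that each token can only attend to the positions before it. Again, this is a toy illustration under the same assumptions as before, not Phi-4's implementation.

```python
# Causal (decoder-only) variant of the attention sketch above: a mask blocks
# each token from attending to later positions, so only preceding text counts.
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
q = rng.normal(size=(seq_len, d_model))
k = rng.normal(size=(seq_len, d_model))

scores = q @ k.T / np.sqrt(d_model)
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)  # True above diagonal
scores[mask] = -np.inf                 # future positions get zero attention weight
weights = softmax(scores)
print(weights.round(2))                # upper triangle is all zeros
```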

In a research paper, Microsoft describes how it improved the output quality of Phi-4 with two post-training optimization techniques: supervised fine-tuning and direct preference optimization. Supervised fine-tuning trains the model on curated example responses, while direct preference optimization trains it on pairs of answers in which one is marked as preferable to the other.
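As a rough illustration of the preference side, the sketch below computes the direct preference optimization loss as defined in the original DPO paper (Rafailov et al., 2023). The log-probability values are placeholders; this is not Microsoft's training code.

```python
# Minimal sketch of the DPO loss (Rafailov et al., 2023). The log-probability
# tensors are placeholders, not outputs of an actual Phi-4 training run.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # How much more the trained policy prefers each answer than the frozen
    # reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Push the preferred answer's margin above the rejected answer's margin.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Placeholder log-probabilities for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -10.0]), torch.tensor([-13.5, -10.5]))
print(loss.item())
```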

In an internal evaluation, Microsoft compared Phi-4 to Llama 3.3 70B, a large language model with five times as many parameters. Microsoft said Phi-4 performed better on the popular GPQA and MATH benchmarks. These two test datasets contain science questions and math problems, respectively.

More and more language models go open-source

Phi-4 joins the growing list of small language models open-sourced by major technology companies over the past year.

In February, Google introduced a set of small language models called Gemma. The models in this series have between 2 billion and 27 billion parameters. According to Google, the 27-billion-parameter version can outperform models more than twice its size.

More recently, Meta Platforms released two Llama 3.2 models with fewer than five billion parameters. The company followed up by open-sourcing even more efficient versions of these models. They implement a machine learning technique called quantization, which stores a neural network's weights at lower numerical precision so that less hardware is needed to run the model.
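As a rough illustration of the idea, not Meta's actual method, the sketch below quantizes a weight matrix to 8-bit integers and back, showing how precision is traded for memory.

```python
# Toy symmetric int8 quantization: store weights as 8-bit integers plus one
# scale factor, cutting memory roughly 4x versus float32. Illustrative only;
# Meta's quantized Llama 3.2 models use more sophisticated schemes.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.02, size=(4, 4)).astype(np.float32)

scale = np.abs(weights).max() / 127.0                   # map largest weight to +/-127
quantized = np.round(weights / scale).astype(np.int8)   # 1 byte per weight
dequantized = quantized.astype(np.float32) * scale      # restore for computation

print("max error:", np.abs(weights - dequantized).max())
```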