A compact AI model with outputs that rival those of LLMs from Meta and Google. That’s the promise Microsoft is making with its newly announced Phi-2, a so-called “Small Language Model” (SLM). What is Phi-2 for? And, more importantly, what does this development say about the future of AI models?
The Microsoft team argues that Phi-2 is an “ideal playground for researchers,” partly because of its compact size. With only 2.7 billion parameters, it is considerably smaller than, say, any of Meta’s Llama 2 variants (7B, 13B, 70B). Nevertheless, according to the benchmarks presented, it performs as well as or better than models up to 25 times its size. In reasoning, language comprehension, mathematics and coding, Phi-2 achieves results comparable to those of its parameter-laden competitors.
Because a smaller model requires less computing power to run, Phi-2 is far more practical to do research with, something Microsoft is now enabling through Azure AI Studio.
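The model is also published on the Hugging Face hub as microsoft/phi-2, so trying it out requires little more than the transformers library. Below is a minimal sketch of loading and prompting the model; the half-precision setting and generation parameters are illustrative assumptions, not an official Microsoft recipe.

```python
# Minimal sketch: loading and prompting Phi-2 via Hugging Face transformers.
# The dtype and generation settings below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # 2.7B parameters fit on a single GPU in fp16
    device_map="auto",          # place layers automatically (requires accelerate)
)

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At roughly 5.5 GB in half precision, the full model fits on a single workstation GPU, which is exactly the accessibility argument Microsoft is making.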
Better data, fewer parameters
Phi-2 was trained on a carefully chosen selection of “textbook quality” data. Where its predecessor Phi-1 focused solely on coding, the new model applies the same philosophy to a much broader range of tasks. Microsoft did not apply reinforcement learning from human feedback (RLHF) after the 14-day training run on 96 Nvidia A100 GPUs; in other words, the model delivers its strong results without any subsequent fine-tuning. In addition, Phi-2 proved less prone to toxic word choices than, say, Llama 2-7B.
The conclusion is clear: better training data, not more parameters, offers a path to further improve AI models. That line of thinking was already gaining prominence in April of this year. OpenAI CEO Sam Altman, for example, stated that the development of GPT-4 (reportedly well over a trillion parameters) had made clear that adding parameters would not endlessly improve outputs. That same month, Databricks presented Dolly 2.0, a model with only 12 billion parameters that still delivered impressive results thanks to a high-quality dataset.
Phi-2: not a new Galactica
A year ago, just before ChatGPT took the world by storm, Meta presented Galactica. This AI model was touted with much fanfare as a useful tool for researchers: it could summarize scientific papers, compose mathematical formulas and much more. However, as several examples quickly demonstrated, a lot of its generated answers were nonsense.
Cameron Wolfe, director of AI at Rebuy Engine, hopes Phi-2 will not receive the same backlash as Galactica. He argues on X that Phi-2 could become the new go-to starting point for academics using AI, partly because the model is openly available to researchers.
Microsoft frames its goal in far less boastful and ambitious terms than Meta did in 2022. The point the Microsoft team is making is that SLMs are becoming a meaningful alternative to the immense models of companies such as Google, Meta and OpenAI. Phi-2 also shows what Microsoft itself has to offer in generative AI, having so far leaned heavily on OpenAI to put AI into virtually every application in its own suite.
Either way, Microsoft at least isn’t showing off Phi-2’s performance with a misleading video, as Google recently did with Gemini.
Tip: Google changes its mind and lets Gemini compete with GPT-4 immediately