Hugging Face has unveiled SmolLM, a family of compact language models suitable for local use on personal devices. Comparable models already exist from Microsoft and Meta, but SmolLM outperforms them in benchmark testing.
SmolLM is a new family of language models (LMs) from Hugging Face, consisting of three models with 135 million, 360 million and 1.7 billion parameters. In a high-end LLM, that number runs into the trillions. The developers of those large models do not publish exact figures, but GPT-4 Turbo is estimated to contain 1.76 trillion parameters, Claude 3 Opus 2 trillion and Gemini 1.5 Pro at least 1.6 trillion.
The number of parameters in AI models is growing at lightning speed, which makes it almost impossible to set a firm standard for how many parameters a model needs to count as “large”. By any measure, though, the new Hugging Face family is compact: its largest member, at 1.7 billion parameters, is close in size to OpenAI’s GPT-2 from 2019 (1.5 billion parameters).
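To give a sense of how lightweight these models are in practice, the sketch below loads the smallest SmolLM variant with the transformers library. The repo id HuggingFaceTB/SmolLM-135M is an assumption about how the checkpoints are named on the Hugging Face Hub; the 360M and 1.7B variants would be loaded the same way.

```python
# A minimal sketch of running the smallest SmolLM checkpoint locally.
# The repo id "HuggingFaceTB/SmolLM-135M" is an assumption about how the
# family is published on the Hub; swap in the 360M or 1.7B variant
# depending on available memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-135M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# 135M parameters at full precision is roughly half a gigabyte of weights,
# small enough for most laptops and many phones.
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```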
High-quality training data
SmolLM deploys those parameters far more effectively than the 2019 OpenAI model. Its performance also exceeds that of recently released compact models such as Meta’s MobileLLM, which came out earlier this month. MobileLLM posted improvements of 2.7 to 4.3 percent in benchmark tests against previous models of the same size. One caveat: each SmolLM model contains roughly 10 million more parameters than the Meta model it is compared against.
[Image: benchmark comparison. Source: Hugging Face]
According to the researchers, the better test results stem from the quality of the training data. For training, the Cosmo-Corpus, Python-Edu and FineWeb-Edu datasets were combined. All three have an educational angle and were thoroughly filtered for quality.
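As an illustration of the kind of quality filtering involved, the sketch below streams one of these educational web datasets and keeps only documents that a quality classifier rated highly. The dataset id HuggingFaceFW/fineweb-edu and the score field (an educational-quality rating assigned by a classifier) are assumptions about how the filtered corpus is published on the Hub, not details confirmed by the SmolLM release.

```python
# A sketch of education-score filtering with the datasets library in
# streaming mode. The dataset id and the "score" field are assumptions
# about how the filtered corpus is published, not taken from the SmolLM
# release itself.
from datasets import load_dataset

stream = load_dataset("HuggingFaceFW/fineweb-edu", split="train", streaming=True)

# Keep only documents the classifier rated as clearly educational
# (here: a score of 3 or higher on an assumed 0-5 scale).
educational = stream.filter(lambda row: row["score"] >= 3)

for row in educational.take(3):
    print(row["text"][:200])
```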
New step toward cloud independence
Hugging Face’s new model family marks another advance for AI models that can run locally. These models are said to operate completely independently of the cloud, which addresses privacy concerns. Such models could significantly boost AI adoption, especially in business environments where it is crucial that internal data does not leak to the outside world.
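A minimal sketch of what cloud-independent operation can look like in practice: after a one-time download of the weights, transformers can be forced to run without any network access, so prompts and internal data never leave the machine. The repo id is the same naming assumption as above.

```python
# A sketch of fully offline use. Once the weights are cached locally,
# the Hub is never contacted again, so no prompt or internal data leaves
# the machine. The repo id is an assumption about the Hub naming.
import os

os.environ["HF_HUB_OFFLINE"] = "1"  # hard-block all network calls to the Hub

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-360M"

# local_files_only makes the offline intent explicit even without the
# environment variable; loading fails rather than reaching the network.
tokenizer = AutoTokenizer.from_pretrained(model_id, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(model_id, local_files_only=True)

inputs = tokenizer("Summarize this internal memo:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```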
Hugging Face hopes to contribute to this field with the launch of SmolLM. The launch matters to developers because the company takes an open-source approach, releasing both the datasets and the training code. That simplifies further development compared to previously available models, where “most of the details about the data curation and training of these models are not publicly available.”