LLMs from Hugging Face now deployable and distributable directly via Cloudflare

Cloudflare now allows developers to deploy AI applications built on Hugging Face models directly on its platform. Using Cloudflare Workers AI, they can then distribute these applications wherever they wish.

Cloudflare and Hugging Face announced their collaboration last year, but the actual integration between the two platforms has now commenced, Cloudflare reports. As part of this collaboration, 14 curated open-source LLMs from the AI specialist can now be brought to Cloudflare and distributed with ease.

According to both parties, this lets developers of AI applications build their applications more easily and, more importantly, more cheaply. The participating LLMs include Mistral 7B v0.2, Nous Research's Hermes 2 Pro (an enhanced version of Mistral 7B), Google's Gemma 7B and OpenChat's Starling-LM-7B-beta.

An essential link in this collaboration is Cloudflare's Workers AI, the company's serverless, GPU-backed inference platform. Workers AI supports the 14 selected open-source Hugging Face models for text generation, embeddings, and sentence similarity. The platform loads the required models on demand, in multiple locations simultaneously, without delays or processing errors.
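To give an idea of what this looks like in practice, the sketch below shows a Worker calling one of the curated models through the Workers AI binding. The model identifier and prompt are illustrative, and the `AI` binding name is an assumption that must match the binding configured in `wrangler.toml`.

```typescript
// Sketch of a Worker using the Workers AI binding (assumes an "AI" binding
// is configured in wrangler.toml). Model identifier and prompt are illustrative.
export interface Env {
  AI: Ai; // Workers AI binding type from @cloudflare/workers-types
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Run a curated Hugging Face model for text generation.
    const result = await env.AI.run("@hf/mistral/mistral-7b-instruct-v0.2", {
      prompt: "Explain serverless GPU inference in one sentence.",
    });

    // Return the model output as JSON to the caller.
    return Response.json(result);
  },
};
```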

This gives developers the freedom and flexibility to choose the right LLMs and quickly scale their AI applications to global distribution. Workers AI does not only work with Hugging Face's LLMs. Among others, Cloudflare's inference platform also works with Meta's Llama 2 and Mistral 7B.
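Because the catalog models sit behind the same inference interface, swapping one model for another is largely a matter of changing the identifier. The sketch below calls Workers AI over its REST API from outside a Worker; the account ID, API token and model name are placeholders.

```typescript
// Minimal sketch: calling Workers AI over its REST API from any environment.
// ACCOUNT_ID, API_TOKEN and the model identifier are placeholders.
const ACCOUNT_ID = "<your-account-id>";
const API_TOKEN = "<your-api-token>";

async function runModel(model: string, prompt: string): Promise<unknown> {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/${model}`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    },
  );
  return response.json();
}

// The same call shape works for the Hugging Face-curated models and for
// other catalog models such as Meta's Llama 2.
const answer = await runModel(
  "@cf/meta/llama-2-7b-chat-int8",
  "Summarize what an inference platform does.",
);
console.log(answer);
```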

Availability and future features

The inference platform is now globally accessible, with GPUs deployed across 150 cities, including Cape Town, Durban, Johannesburg, Lagos, Amman, Buenos Aires, Mexico City, Mumbai, New Delhi and Seoul.

Cloudflare is still extending Workers AI with features to support fine-tuned models. This should soon allow developers to build and deploy specialized, domain-specific applications.

Also read: Cloudflare Magic Cloud Networking ties public cloud connections together