ServiceNow and Hugging Face have announced the large language model StarCoder. The new LLM is designed to aid in code generation. It is open‑access, open‑science, and open‑governance, and it makes generative AI more transparent and accessible, the companies claim.
ServiceNow is best known in the AI community for developing machine language models. Together with Hugging Face, it touted the new release as “one of the world’s most responsibly developed and strongest‑performing” LLMs available for generating code.
The open‑governance platform boasts 15 billion parameters, the companies said, and was designed to make generative AI “more transparent and accessible to enable responsible innovation at scale”.
Introducing the BigCode Project
The new LLM is part of the BigCode Project, a joint initiative to develop state‑of‑the‑art AI systems for code “openly and responsibly” and with the support of the open‑scientific AI research community.
ServiceNow Research and Hugging Face launched the joint BigCode Project in September 2022. The project continues to operate as an open scientific collaboration. The two companies seek to “harness the collective brainpower and resources from the open‑source community” through BigCode working groups, task forces, and meetups.
StarCoder was trained on a trillion tokens of permissively licensed source code covering over 80 programming languages from BigCode’s The Stack v1.2 dataset, according to the companies. The LLM can be deployed to bring pair‑programming‑style generative AI to applications, with capabilities such as text‑to‑code and text‑to‑workflow.
Supporting code has been open-sourced on the BigCode project’s GitHub.
Promoting “responsible innovation” in AI
Harm de Vries, lead of the Large Language Model Lab at ServiceNow Research and co‑lead of BigCode, hailed the new LLM. “New, responsible AI practices to train and share large language models are vital to ensuring the right protocols, safeguards, and permissive licenses are in place for our customers, and StarCoder is making this possible”, he said.
Leandro von Werra, machine learning engineer at Hugging Face and another co‑lead of BigCode, echoed the sentiments of de Vries. “This endeavour is a testament to the potential of open‑source as we work toward democratizing AI”, he said.