Fake stars undermine GitHub: 4.5 million fraudulent stars discovered

GitHub is grappling with fake stars that artificially inflate the popularity of repositories distributing scams and malware, which helps those repositories reach more unsuspecting users.

BleepingComputer reports this. Stars on GitHub work much like a "Like" button: users give a repository a star to mark it as a favorite. GitHub uses stars as part of its global ranking system and to recommend related content that users might find interesting. Users can also star repositories and topics to discover similar projects.

Problem with fake stars

The problem has been documented before. Last summer, for example, Check Point uncovered a malware delivery service called Stargazers Ghost Network, which used an elaborate system of fake users to star fake projects in order to spread information-stealing malware. Non-malicious projects also use fake stars to boost their popularity, extend their reach and attract real users.

A new study by researchers from Socket, Carnegie Mellon University and North Carolina State University reveals the extent of this problem. The study shows that 4.5 million stars on GitHub are believed to be fake.

Detecting fake stars

The researchers developed a tool called StarScout to analyze 20 TB of data from GHArchive and identify fake stars. GHArchive contains metadata from over 6 billion GitHub events from July 2019 to October 2024, including 60.5 million user actions on 310 million repositories and 610 million stars.

StarScout looks for two kinds of signals: accounts with low activity on GitHub, such as those that star only one repository or behave like bots or temporary accounts, and groups of accounts that act in a coordinated manner, for example by starring the same repositories within a short period.
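
The low-activity signal can be illustrated with a minimal Python sketch. This is not StarScout's code: the event tuple layout, the function name `low_activity_accounts` and the `max_other_events` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def low_activity_accounts(events, max_other_events=1):
    """Return accounts that starred exactly one repository and did little else.

    `events` is an iterable of (user, repo, kind) tuples, where kind is
    "star" or any other event name. The tuple layout and the threshold
    are hypothetical, not taken from the study.
    """
    starred = defaultdict(set)   # user -> set of repositories starred
    other = defaultdict(int)     # user -> count of non-star events
    for user, repo, kind in events:
        if kind == "star":
            starred[user].add(repo)
        else:
            other[user] += 1
    return {
        user
        for user, repos in starred.items()
        if len(repos) == 1 and other[user] <= max_other_events
    }


sample_events = [
    ("a", "repo1", "star"),                       # one star, nothing else
    ("b", "repo1", "star"), ("b", "repo2", "star"),
    ("c", "repo1", "star"), ("c", "repo3", "push"),
    ("c", "repo3", "push"), ("c", "repo3", "issue"),
]
suspects = low_activity_accounts(sample_events)
```

In this sample only account "a" is returned: "b" starred two repositories and "c" has too much other activity.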

The method is based on CopyCatch, an algorithm designed to detect fraudulent patterns on social networks.
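
The lockstep signal can be sketched in a similar way. The code below is a deliberately simplified stand-in, not CopyCatch and not StarScout's implementation: it buckets stars into fixed time windows and reports pairs of accounts that starred several of the same repositories within the same window. The input layout, the function name `lockstep_pairs` and both thresholds are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def lockstep_pairs(stars, window_days=7, min_shared_repos=3):
    """Find account pairs that starred many of the same repos close in time.

    `stars` is an iterable of (user, repo, day) tuples, with `day` an integer
    day index. Fixed windows are a simplification: two stars just either side
    of a window boundary are not matched. All parameters are hypothetical.
    """
    # Group users by (repository, time window).
    buckets = defaultdict(set)
    for user, repo, day in stars:
        buckets[(repo, day // window_days)].add(user)

    # For each pair of users in a bucket, record the repository they share.
    shared = defaultdict(set)
    for (repo, _window), users in buckets.items():
        for a, b in combinations(sorted(users), 2):
            shared[(a, b)].add(repo)

    return {
        pair: repos
        for pair, repos in shared.items()
        if len(repos) >= min_shared_repos
    }


sample_stars = [
    ("u1", "r1", 0), ("u2", "r1", 1), ("u3", "r1", 2),
    ("u1", "r2", 10), ("u2", "r2", 11),
    ("u1", "r3", 20), ("u2", "r3", 20), ("u3", "r3", 40),
]
pairs = lockstep_pairs(sample_stars)
```

Here only the pair ("u1", "u2") is reported, since those two accounts starred r1, r2 and r3 within the same weekly windows. The pairwise comparison is quadratic in bucket size, so it suits small examples only, not data at the scale of GHArchive.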

4.5 million suspicious stars

After processing the data using low-activity and lockstep algorithms, the team discovered 4,530,000 suspicious stars from 1,320,000 accounts across 22,915 repositories.

To increase the reliability of the results, the researchers filtered out possible false positives. They kept only repositories that showed a significant spike in starring activity within a single month and in which more than 10% of stars were flagged as fake. This reduced the results to 3,100,000 fake stars from 278,000 accounts across 15,835 repositories.
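
A repository-level filter of this kind could look roughly like the sketch below. Only the 10% threshold comes from the study as reported. The definition of a "significant spike" (peak month at least three times the average of the other months) and the names `passes_filter` and `spike_factor` are assumptions made for illustration.

```python
def passes_filter(monthly_stars, fake_stars, min_fake_ratio=0.10, spike_factor=3.0):
    """Decide whether a repository survives the false-positive filter.

    `monthly_stars` is a list of star counts per month and `fake_stars` is the
    number of stars flagged as suspicious. The spike rule is a hypothetical
    stand-in; the article does not say how the researchers defined a spike.
    """
    total = sum(monthly_stars)
    if total == 0:
        return False

    peak = max(monthly_stars)
    other_months = len(monthly_stars) - 1
    # Average of all months except the peak month (0 if there is only one month).
    baseline = (total - peak) / other_months if other_months > 0 else 0.0

    has_spike = peak > spike_factor * baseline
    mostly_fake = fake_stars / total > min_fake_ratio
    return has_spike and mostly_fake


spiky_and_fake = passes_filter([5, 4, 120, 6], fake_stars=40)
steady_growth = passes_filter([30, 32, 35, 31], fake_stars=40)
spiky_but_clean = passes_filter([5, 4, 120, 6], fake_stars=5)
```

Only the first repository passes: the second has no monthly spike, and in the third fewer than 10% of stars are flagged.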

Of these repositories and accounts, about 91% of repositories and 62% of suspicious accounts had been deleted by October 2024, supporting the accuracy of the StarScout tool.

Suspicious repositories

The study also shows that fake star activity increased sharply in 2024. In July 2024, about 15.8% of repositories with more than 50 stars were involved in these rogue campaigns.

The researchers reported the suspicious repositories and accounts that StarScout identified in July 2024, and GitHub removed them. Additional clusters found in November 2024 are still being evaluated and reported.

Reduced trust in GitHub

Fake stars undermine trust in GitHub and the projects hosted on it. Users are advised to look beyond stars and evaluate repositories for activity, quality, documentation, content, contributions and code.

Deceptive GitHub repositories are widespread and have even been used in state-sponsored operations, so caution is advised when downloading software from the platform.

BleepingComputer has contacted GitHub for more information on how the platform is actively combating fake stars, but it is still waiting for a response.