Anthropic has demonstrated how far autonomous AI development has come with a remarkable experiment: sixteen AI agents built a C compiler almost entirely on their own. The results show both technological progress and clear limitations.
The experiment comes at a time when several AI vendors are focusing on agentic systems. Both Anthropic and OpenAI recently introduced new tooling for multi-agent use, so the timing of the publication hardly seems coincidental, Ars Technica notes.
In the experiment, sixteen AI agents, all running on Claude Opus 4.6, were tasked with building a C compiler in Rust from scratch. After formulating the goal, the human supervisors largely withdrew. The agents worked in parallel on a shared Git repository, without central orchestration or a controlling main agent.
To make this possible, the company built dedicated technical infrastructure. Each AI agent ran in a separate Docker container and worked in an infinite loop, automatically starting a new session after completing a task. The agents coordinated tasks among themselves using simple lock files in the shared repository, so they did not interfere directly with one another's work.
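Anthropic has not published this harness code, so the following is only a minimal sketch of how such a worker loop could work; the task names, file layout, and CLI invocation are assumptions, not the company's actual setup.

```python
# Illustrative sketch only: task list, paths, and CLI call are hypothetical.
import os
import subprocess
import time

REPO = "/repo"                                   # shared Git checkout mounted into the container
LOCK_DIR = os.path.join(REPO, "locks")

def try_claim(task: str) -> bool:
    """Claim a task by atomically creating its lock file.
    O_CREAT | O_EXCL fails if another agent already holds the lock."""
    try:
        fd = os.open(os.path.join(LOCK_DIR, f"{task}.lock"),
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

def run_session(task: str) -> None:
    """Run one non-interactive agent session scoped to a single task
    (hypothetical invocation of the Claude Code CLI)."""
    subprocess.run(["claude", "-p", f"Work on: {task}"], cwd=REPO, check=False)

def main() -> None:
    os.makedirs(LOCK_DIR, exist_ok=True)
    while True:                                   # infinite worker loop, one session per task
        tasks = ["lexer", "parser", "codegen-x86"]  # invented examples; in reality read from the repo
        for task in tasks:
            if try_claim(task):                   # the lock persists, marking the task as taken
                run_session(task)
        time.sleep(10)                            # idle briefly before rescanning for new work

main()
```

Creating a file with O_CREAT | O_EXCL is atomic on a shared filesystem, which is why lock files are a common low-tech substitute for a central orchestrator.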
Two thousand Claude Code sessions
The project ran for almost two weeks and comprised roughly two thousand Claude Code sessions. The agents processed about two billion input tokens and generated around 140 million output tokens, at a cost of nearly twenty thousand dollars in API fees. The end result is a compiler of roughly one hundred thousand lines of code.
According to Anthropic, the compiler can build real-world software. For example, it successfully compiled a bootable Linux 6.9 kernel for the x86, ARM, and RISC-V architectures. Projects such as PostgreSQL, SQLite, Redis, FFmpeg, and QEMU also compiled successfully. On the GCC torture test suite, the compiler achieved a pass rate of approximately 99 percent. As an informal final test, it even compiled and ran the game Doom.
At the same time, external reports clearly question the degree of autonomy. Although the AI agents wrote the code independently, the experiment required considerable human preparation. According to Ars Technica, most of the work lay not in the programming itself but in designing test harnesses, CI pipelines, and feedback mechanisms tailored to the limitations of language models.
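What such a feedback mechanism might look like can be sketched as follows; this is not Anthropic's actual harness, and the compiler path and torture-style test layout are assumptions made for illustration.

```python
# Minimal sketch of a test harness whose failures could be fed back to an agent.
# The compiler binary name and test directory are hypothetical.
import pathlib
import subprocess

COMPILER = "./target/release/cc"      # assumed path to the agent-built compiler

def run_suite(test_dir: str) -> list[str]:
    """Compile and execute each .c test; collect descriptions of failures
    so they can be pasted into the next agent session as a prompt."""
    failures = []
    for test in sorted(pathlib.Path(test_dir).glob("*.c")):
        exe = test.with_suffix(".out")
        compile_res = subprocess.run([COMPILER, str(test), "-o", str(exe)],
                                     capture_output=True, text=True)
        if compile_res.returncode != 0:
            failures.append(f"{test.name}: compile error\n{compile_res.stderr}")
            continue
        run_res = subprocess.run([str(exe)], capture_output=True)
        if run_res.returncode != 0:   # torture-style tests signal success via exit code 0
            failures.append(f"{test.name}: wrong runtime behavior")
    return failures

if __name__ == "__main__":
    failing = run_suite("tests/torture")
    print(f"{len(failing)} failing tests")
    for f in failing[:20]:            # cap the feedback so it fits in a prompt
        print(f)
```

A loop like this gives a language model the objective, machine-checkable signal it needs to make progress, which is precisely the scaffolding work the external reports highlight.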
In that context, Anthropic emphasizes that the compiler was developed without direct external influence. The AI agents had no internet access during development and used only the Rust standard library. The company therefore refers to it as a clean-room implementation.
However, this qualification is contested. In the classic sense, a clean-room implementation is produced by developers who have demonstrably never seen the original code. Although the development environment was sealed off, the underlying language model was pre-trained on large amounts of publicly available source code, which almost certainly includes existing C compilers, test suites, and associated tooling. Anthropic's use of the term thus deviates from its established meaning in software development.
These limitations became particularly apparent as the project grew. As the codebase approached 100,000 lines, new bug fixes and extensions regularly began to break existing functionality. This pattern, familiar from large human-written codebases, also emerged among AI agents operating autonomously for long periods. The experiment thus suggests a practical scale limit for agentic software development with the current generation of models.
The complete source code is publicly available, and Anthropic explicitly presents the project as research. The experiment shows what is possible with current AI agents, but also where the practical limits of large-scale autonomous software development lie.