The inevitable startle response to vibe coding

AI code is growing, code review is not keeping pace

Since February, “vibe coding” has been the term for freely generating programming code with AI. Enough time has passed to see that this is not a passing fad. The same goes for all the problems and frustrations that vibe coding brings with it.

Thanks to AI, even less experienced programmers can realize their software ideas. Where no-code and low-code previously paved the way for what was then lauded as a wave of “citizen developers,” the democratization of coding is now far more realistic. One problem: the IT world is not equipped for it.

No code review

Although LLMs are more capable than they were two years ago, they continue to generate insecure code. This is already a risky aspect of AI-assisted coding for seasoned developers, let alone for laypeople. AI code is now being generated at scale, with all the consequences that entails. Code review, a critical element of enterprise-grade software development, is not keeping pace with the explosion of AI code generation.

If AI really does write 30 percent of Microsoft’s new code, as CEO Satya Nadella has stated, then this imbalance is highly problematic. After all, it is unclear whether even the largest tech companies, such as Microsoft and Meta, have the capacity to review the extra code in a timely manner.

What exactly is the danger here? Independent tests have repeatedly shown that alarmingly high percentages of AI-generated code snippets contain vulnerabilities. For example, this article shows that GitHub Copilot produces insecure code in 44 percent of cases, often with serious vulnerabilities that attackers love to exploit. Prime examples are out-of-bounds writes and SQL injections, exactly the kinds of weaknesses that frequently show up in exploits.
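To make the SQL injection case concrete, here is a minimal sketch of the kind of flaw such tests flag, next to the parameterized fix. The table, function names, and payload are invented for illustration and do not come from the cited tests:

```python
import sqlite3

# Illustrative in-memory database with one non-admin user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

def find_user_unsafe(name):
    # Vulnerable: user input is interpolated directly into the SQL string,
    # the pattern AI assistants are often observed to produce.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(name):
    # Safe: a parameterized query treats the input as data, not as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

# A classic injection payload turns the filter into a tautology.
payload = "' OR '1'='1"
print(find_user_unsafe(payload))  # matches every row
print(find_user_safe(payload))    # matches nothing
```

The two functions differ by a single line, which is exactly why such flaws slip through when generated code is merged without review.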

You may wonder whether this is predictable. Although the outputs of GenAI models are by definition non-deterministic, patterns in the errors they make will undoubtedly be detectable, even if those errors differ from human-introduced vulnerabilities. That could well be the key: the sheer scale of AI-generated code has the advantage of making it increasingly clear where GenAI goes wrong.
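If AI-generated flaws really do cluster into recognizable patterns, even a crude scanner can flag candidates for human review. The sketch below is purely illustrative: the pattern names and regular expressions are invented, and real static analysis tools are far more sophisticated:

```python
import re

# Hypothetical catalog of recurring AI-code smells, keyed by description.
SUSPECT_PATTERNS = {
    "possible SQL injection": re.compile(
        r"""f["'].*(SELECT|INSERT|UPDATE|DELETE)\b.*\{""", re.IGNORECASE
    ),
    "possible command injection": re.compile(r"os\.system\(.*\+"),
}

def scan(snippet: str) -> list[str]:
    """Return the descriptions of all suspect patterns found in a snippet."""
    return [
        name
        for name, pattern in SUSPECT_PATTERNS.items()
        if pattern.search(snippet)
    ]

# A generated line that interpolates user input into an f-string query.
generated = "cur.execute(f\"SELECT * FROM users WHERE name = '{name}'\")"
print(scan(generated))  # flags the SQL injection pattern
```

The point is not that a regex list solves the problem, but that systematic flaws invite systematic detection, which is where the scale of AI output could work in defenders’ favor.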

The question is whether the rapid development of GenAI allows for such predictability. Agentic AI complicates matters in any case, because it descends into all kinds of software layers and therefore introduces errors that are more complex and harder to detect. For the time being, this has not been investigated in the same way as the flat, single-shot code generated by LLMs, so it remains to be seen whether the errors made are truly inscrutable.

Conclusion: finding balance

Vibe coding has undoubtedly brought a breath of fresh air and optimism. Suddenly, the barrier to software creation seems lower than ever, but the challenge of keeping code secure remains. At the same time, the challenge of finding enough staff and paying them in line with the market continues to grow.

It is therefore clear that AI assistance in coding is not going away anytime soon, regardless of the dangers. The reality is that tooling will have to respond to “vibe-coded” contributions; if AI vulnerabilities are predictable, they may be eliminated just as systematically as they are introduced. That will require a new paradigm around code review. For now, it still requires a human touch.

Read also: ‘Vibe coding’ increasingly standard for code development