
What’s wrong (and right) with AI coding agents

The enterprise software application development space is rife with discussion around the use of (predominantly agentic, but potentially other) artificial intelligence services that can now be used to “cut code” and create our software apps and data services for us. But as much as there is excitement and anticipation in this arena, there is also dissent and disagreement. After all, an ability to whack keystrokes down on clacky keyboards has never been the factor holding developers back, so why would automated code generation be useful when the real work lies in requirements, architecture, provisioning, debugging and maintenance? Techzine Global spoke to a handful of developer luminaries for the inside track on computer-generated computer code.

Dan Lorenc, CEO and co-founder of software supply chain security company Chainguard, agrees that generating code is the easy part.

“Agents can crank out more proposed changes in a day than a human team used to ship in a week. Code development itself is no longer the bottleneck, but trust is. When change becomes abundant, confidence becomes scarce… and if you can’t trust what’s being generated, speed just turns into chaos,” said Lorenc.

Guardrails to guide us

He further suggests that software engineering is going to start looking more like Continuous Integration (CI) engineering. 

“This is a state where we see that the teams that move fastest will be the ones with clear tests, tight review policies, automated enforcement and reliable merge paths. Those guardrails are what make AI useful. If your systems can automatically catch mistakes, enforce standards, and prove what changed and why, then you can safely let agents do the heavy lifting. If not, you’re just accelerating risk,” said Lorenc.
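As a rough sketch of the kind of merge-path guardrails Lorenc describes, a pipeline can refuse to merge any change – human- or agent-authored – unless every automated check passes. The check names and `make` targets below are hypothetical examples, not drawn from any quoted company:

```python
import subprocess

# Illustrative merge gate: a proposed change is only eligible to merge
# if every guardrail command exits successfully. The targets are invented.
CHECKS = [
    ("tests", ["make", "test"]),    # catch functional regressions
    ("lint",  ["make", "lint"]),    # enforce coding standards automatically
    ("audit", ["make", "audit"]),   # flag known-vulnerable dependencies
]

def change_is_mergeable(run=subprocess.run):
    """Run every guardrail; any non-zero exit code blocks the merge."""
    for name, command in CHECKS:
        if run(command).returncode != 0:
            print(f"guardrail failed: {name}")
            return False
    return True
```

The point of the pattern is that the pipeline, not the author, becomes the arbiter of what ships – which is exactly what makes it safe to let agents propose the bulk of the changes.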

Tablestakes toolkit tools

Brian Fox, CTO and co-founder at Sonatype, a company focused on open source risk management for secure development pipelines, has commented on this space, noting that organisations including Google (with Google Conductor) and Anthropic (with Claude Code) have introduced automated code review checking services. He agrees that automated code review isn’t just tablestakes – it’s a necessity.

“At the scale AI is generating pull requests today, humans simply can’t keep up. You don’t check the accuracy of Excel with an abacus… and in 2026 we shouldn’t expect maintainers to manually inspect machine-speed code without machine-speed assistance,” said Fox. “AI reviews can go deeper than humans in many cases. They don’t get tired, they can reason across large codebases… and they can spot patterns at a scale no individual reviewer can hold in their head. If AI is generating more code, the only viable answer is to use AI to help review and validate it. You have to fight fire with fire.”

Down with the differentiated “diffs”

But, he says, we also need to be clear about where the real risk lives. Most modern applications are largely assembled from open source components. In his view (albeit a rather supply chain-centric view of the agentic coding space), reviewing first-party diffs (see below) is important, but the bigger systemic risk sits in the supply chain – in the dependencies being introduced and the trust decisions being made upstream.

EXPLANATORY NOTE: In software application development, “diffs” (or, quite simply, differences) are the specific code changes made by developers within a team’s own proprietary codebase; they are reviewed and approved through the code review or Pull Request (PR) process.
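For readers who have never seen one, a diff is simply a line-by-line comparison of a file before and after a change, with removed lines prefixed `-` and added lines prefixed `+`. Python’s standard difflib module can generate the familiar unified view; the function being “reviewed” below is purely illustrative:

```python
import difflib

# The same (hypothetical) function before and after a review fix.
before = [
    "def total(prices):",
    "    return sum(prices)",
]
after = [
    "def total(prices, tax=0.0):",
    "    return sum(prices) * (1 + tax)",
]

# unified_diff yields the -/+ view that reviewers see in a pull request.
for line in difflib.unified_diff(before, after, "before.py", "after.py", lineterm=""):
    print(line)
```

This first-party view is exactly what both human reviewers and the new AI review tools operate on.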

“The opportunity isn’t just smarter PR comments. It’s embedding policy and supply chain intelligence directly into the development workflow so risky components and unsafe versions are prevented before they ever make it into a build. AI shouldn’t just accelerate development. It should make secure and compliant choices the default,” added Fox.
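The kind of in-workflow policy gate Fox describes can be sketched very simply: before a build, check every declared dependency against a policy table of known-unsafe versions. The package names and the blocked-version table below are hypothetical; a real system would be fed by a curated intelligence feed, not a hard-coded dictionary:

```python
# Hypothetical policy table: (package, version) pairs blocked from builds,
# each with the reason the policy engine recorded.
BLOCKED = {
    ("leftpadx", "1.2.0"): "typosquatting report",
    ("cryptolib", "0.9.1"): "known CVE in this version",
}

def check_dependencies(declared):
    """Return a list of (package, version, reason) policy violations."""
    return [
        (name, version, BLOCKED[(name, version)])
        for name, version in declared
        if (name, version) in BLOCKED
    ]

# Run the gate over an example manifest before anything is built.
violations = check_dependencies([("requests", "2.32.0"), ("cryptolib", "0.9.1")])
for name, version, reason in violations:
    print(f"BLOCKED {name}=={version}: {reason}")
```

Running the check at manifest time, before dependency resolution, is what makes the secure choice the default: the unsafe version never reaches the build at all.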

Memgraph’s CEO Dominik Tomicevic is always vocal on this subject. As the man steering this specialist in in-memory graph databases that process streaming data for real-time analytics insights, he is used to working at speed.

Don’t optimise the code load

“‘Coding’ is the wrong thing to optimise for… and AI dev tools that only boast about lines of code generated don’t help the CIO (or the rest of the team) at all. What matters is better architecture, closer alignment to business context and the highest possible standards of security and quality,” said Tomicevic. “If this development genuinely helps teams make better architectural decisions, ship safer systems and align software much more tightly to business reality, it’s a positive step and to be welcomed. But if it simply becomes a production line for churning out ever more code, it’s hard to see that as real progress.”

He reminds us that quantity does not always equal quality – especially in the AI-driven world we now live in. He notes that, at least for now, AI development tools and ‘vibe coding’ can generate a lot of code very quickly, but code that’s often slower and more memory-hungry than what a skilled developer would write. Perhaps, then, the priority isn’t merely faster coding – it’s making better architectural decisions.

AI should learn the data model

“Maybe we’d be better off teaching AI to think more about the data model, rather than just generating more lines of code. If AI helps teams understand their systems as connected graphs – of services, data and dependencies – they can cut zombie assets, optimise critical paths and build software that’s far more aligned with real business needs. AI maturity, in other words, isn’t just using more models; it’s using AI to build the right foundations underneath them – foundations that are increasingly graph-shaped, reflecting the deeply interconnected nature of modern systems,” explained Tomicevic.
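Tomicevic’s “systems as connected graphs” idea can be made concrete with a small sketch: model services and data stores as nodes, dependencies as edges, and treat anything unreachable from the real entry points as a candidate zombie asset. All the service names below are invented for illustration:

```python
from collections import deque

# Hypothetical system graph: each service maps to the things it depends on.
DEPENDS_ON = {
    "web-frontend":  ["orders-api", "auth-api"],
    "orders-api":    ["orders-db"],
    "auth-api":      ["users-db"],
    "legacy-report": ["orders-db"],   # no entry point reaches this node
    "orders-db":     [],
    "users-db":      [],
}

def reachable(graph, roots):
    """Breadth-first search: every node reachable from the given entry points."""
    seen, queue = set(roots), deque(roots)
    while queue:
        for dep in graph[queue.popleft()]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

# Anything not reachable from the real entry points is a candidate zombie asset.
zombies = set(DEPENDS_ON) - reachable(DEPENDS_ON, ["web-frontend"])
print(sorted(zombies))  # -> ['legacy-report']
```

The same traversal over a graph of real services, data and dependencies is what lets teams cut zombie assets and reason about critical paths, which is the “graph-shaped foundation” Tomicevic is pointing at.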

“If there’s some basic economics behind us wanting to use computers to write our code ‘for free,’ there’s unfortunately another economic principle we need to consider: the famous Jevons Paradox, first observed during the industrial revolution. It shows that as we increase the efficiency of a resource, total consumption of that resource often rises instead of falling.

“In other words, we’ll end up with ever more software if it’s easier to generate, which could make keeping track of it and QA-ing it exponentially harder; and the more code we have, the harder it will be to prove it and get value from it. Technology activist Cory Doctorow may be right – that software isn’t an asset but a liability that needs endless patching and maintaining – and AI could exponentially multiply that challenge,” added Tomicevic.

Architecture above all

Although this entire discussion is focused on the now-increasingly-automated command line, it feels like the real focus should sit higher up the stack; architecture has already been mentioned more than once.

“We’re entering a world where, with AI, software changes are propagating faster than governance models can track them. That means AI tools are, plain and simple, accelerating systemic complexity. When an AI agent can generate and deploy changes across interconnected enterprise systems, there’s real danger in the invisible dependencies and downstream effects most orgs can’t fully see,” said Ido Gaver, CEO and co-founder of Sweep, a software platform known for its ability to manage software development workflows and automate tasks.

“The conversation shouldn’t be limited to package risk or upstream vulnerabilities. The architectural issue is visibility. If enterprises don’t understand their metadata relationships, automation chains, and permission structures inside their own systems, AI will only amplify that opacity. Agentic velocity without architectural clarity is how instability spreads and why we’re seeing such high failure rates on these tools,” concluded Gaver.

There is much to learn here and the changes afoot will manifest themselves above the command line, right on our desktops… but the feeling is that engineering needs to start a whole lot lower down, so that we embed agentic coding functions in the right way before we even begin.