The AI shift from prompt engineering to flow engineering

The IT industry is busy working out how to develop, program with and exploit Artificial Intelligence (AI) in enterprise applications. Key among the techniques being used to create, coalesce and connect AI is prompt engineering, i.e. the process of designing detailed inputs (characters, words, phrases, symbols etc.) and channels for AI data so that a model produces the optimal output for human interaction and use in any given situation. But just as we’re getting our heads around prompt engineering, some argue that we also need to embrace flow engineering. So what is this second technique and why will it matter to our new AI universe?

We know that prompt engineering requires developers to use precise wording and structure when asking a model to generate code. While well-crafted prompts are widely agreed to significantly enhance Large Language Model (LLM) performance at code generation, software application developers are still more likely to get something close to what they want than exactly what they want, and a near miss is not always entirely useful.

Close, not exactly a cigar

This ‘close, but not quite’ observation is perhaps the most useful caveat in the current AI engineering discussion. It comes from Itamar Friedman, CEO and co-founder of Tel Aviv-based CodiumAI, an organization known for its code testing platform that also offers AI code completion, search and chat functionality. He suggests that LLMs’ sensitivity to minor variations in phrasing at the prompt engineering level poses a problem for developers and, further, that this is a sign that relying on prompt engineering alone isn’t ideal.

So what about fine-tuning LLMs and training models to follow directions better?  

“That won’t solve the problem on its own either,” argues Friedman. “Trying to get LLMs to directly generate high-quality, working code based only on a description of a problem is like trying to hit a golf hole-in-one on every shot. It’s not how developers work in real life. Devs write code step by step, try different things, see what works and what doesn’t, and improve on their solutions iteratively. The birth of better and better models will provide incremental improvements, but I believe what’s needed is a change in approach to create coding AI that more accurately reflects how developers work in real life (IRL).”

Time to think slow?

To consistently generate high-quality code that does what we intend, the CodiumAI team suggest that we need to move away from expecting LLMs to generate immediate responses and instead have them undergo the same process a human developer would when tackling a complex problem. 

Daniel Kahneman in his book Thinking, Fast and Slow introduced the idea of System 1 and System 2 thinking processes.

  • System 1 is a human being’s intuitive, automatic mode of thinking. It’s fast, effortless and helps with everyday tasks. 
  • System 2 is more deliberate, involving a human taking multiple reasoning steps. It’s slower and requires conscious effort, but is more effectively used for tasks that involve careful reasoning and problem-solving. 

“Prompting an LLM to get an immediate solution can be likened to System 1: it operates automatically to generate an ‘intuitive’ solution almost instantly. But to create AI that can solve difficult problems we need to move to System 2, i.e. an agent-like tool that goes through a multi-step process, often referred to as a ‘flow’, before delivering a final answer,” clarified Friedman.

Each step in the automated flow process can involve prompting an LLM — but can also involve using ‘classic’ programming to manipulate text, or ‘tools’ like attempting to compile and run generated code.  
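To make the three kinds of step concrete, here is a minimal Python sketch of a flow. It is an illustration under stated assumptions rather than CodiumAI's implementation: stub_llm is a hypothetical stand-in for a real model call, the <code> tag convention is invented for the example, and the 'tool' is simply Python's built-in compile() used as a syntax check.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Step = Callable[[State], State]


def stub_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a canned reply.
    return "Sure! <code>\ndef add(a, b):\n    return a + b\n</code>"


def prompt_step(state: State) -> State:
    # Step type 1: prompt an LLM.
    reply = stub_llm("Write a function for: " + state["task"])
    return {**state, "reply": reply}


def extract_step(state: State) -> State:
    # Step type 2: 'classic' programming that manipulates text.
    # The <code> tags are an assumed convention for this sketch.
    reply = state["reply"]
    start = reply.index("<code>") + len("<code>")
    end = reply.index("</code>")
    return {**state, "code": reply[start:end].strip()}


def compile_step(state: State) -> State:
    # Step type 3: a 'tool', here the built-in compile() as a syntax check.
    try:
        compile(state["code"], "<generated>", "exec")
        status = "ok"
    except SyntaxError as err:
        status = "syntax error: " + str(err)
    return {**state, "compile_status": status}


def run_flow(steps: List[Step], state: State) -> State:
    # A flow is an ordered list of steps, each passing its state to the next.
    for step in steps:
        state = step(state)
    return state


result = run_flow(
    [prompt_step, extract_step, compile_step],
    {"task": "add two numbers"},
)
print(result["compile_status"])
```

Because every step has the same shape (state in, state out), swapping the stub for a real model or adding a further check means editing the list of steps and leaves the prompt alone.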

Embracing flow engineering 

“Taking all this contextualisation, clarification and validation into account then, this is why we should shift our focus from prompt engineering, attempting to craft the perfect prompt for a particular problem, onward to flow engineering, focusing on designing the best flow with the correct steps for solving the problem. This methodology will take AI coding to the next level as it emulates how developers truly work and piece out problems using iterative problem-solving,” said Friedman.

He asserts that for coding problems, AI-focused software application developers and data scientists need a dedicated flow oriented around code generation and testing: an iterative process that repeatedly runs generated code against input-output tests and fixes it, much as a developer would when writing code manually.
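The run-and-fix loop described here can be sketched in a few lines of Python. This is a toy under stated assumptions, not the flow CodiumAI uses: stub_generate is a hypothetical stand-in for an LLM whose first draft is deliberately buggy, the function name solve and the limit of three rounds are arbitrary choices, and a real flow would execute generated code in a sandbox instead of calling exec() directly.

```python
from typing import Callable, List, Optional, Tuple

Test = Tuple[tuple, object]  # (arguments, expected result)


def stub_generate(task: str, feedback: Optional[str]) -> str:
    # Hypothetical stand-in for an LLM: the first draft is buggy,
    # and any draft produced after feedback is correct.
    if feedback is None:
        return "def solve(a, b):\n    return a - b"
    return "def solve(a, b):\n    return a + b"


def run_tests(code: str, tests: List[Test]) -> List[str]:
    # Execute the candidate and collect a message for every failing test.
    # exec() runs arbitrary code; a real flow would sandbox this step.
    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception as err:
        return ["code failed to load: " + repr(err)]
    failures = []
    for args, expected in tests:
        try:
            actual = namespace["solve"](*args)
        except Exception as err:
            failures.append(f"solve{args} raised {err!r}")
            continue
        if actual != expected:
            failures.append(
                f"solve{args} returned {actual!r}, expected {expected!r}"
            )
    return failures


def iterate(
    task: str,
    tests: List[Test],
    generate: Callable[[str, Optional[str]], str],
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    # Generate, run the tests, feed failures back, and repeat.
    feedback = None
    for round_number in range(1, max_rounds + 1):
        code = generate(task, feedback)
        failures = run_tests(code, tests)
        if not failures:
            return code, round_number
        feedback = "\n".join(failures)
    return None, max_rounds


tests = [((1, 2), 3), ((5, 5), 10)]
code, rounds = iterate("add two integers", tests, stub_generate)
print(rounds)  # the stub fails round 1 and passes round 2
```

The failure messages double as the next prompt's context, which is what separates this loop from simply retrying the same prompt.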

“Validating generated code by running and testing as part of a flow ensures code integrity: that the code runs and does what is expected and is free of bugs. This is a far more robust way of generating code that empowers developers and can hugely improve the software development process,” said Friedman.

In real life (use cases) 

The CodiumAI team say they have seen the benefit of flow engineering firsthand in AlphaCodium, their research on using LLMs to solve competitive coding problems. With AlphaCodium, the company says it has moved from a naive prompt-answer paradigm to a ‘flow’ paradigm in which the answer is constructed iteratively, through a test-based, multi-stage process.

“Two key elements for the success of this specific code-oriented flow were (a) generating additional data in a pre-processing stage, by doing self-reflection and reasoning about the supplied tests, to aid the iterative process and (b) enrichment of the supplied tests with additional AI-generated tests. Using this method on the CodeContests dataset, containing around 10,000 competitive programming problems, our results showed the best approach to code generation seen yet. The AlphaCodium flow outperformed the previous best approach, Google DeepMind’s AlphaCode, by a significant margin and even performed better than the average developer competing in coding competitions,” stated Friedman, in conclusion. 
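The two elements Friedman names, a reflective pre-processing stage and enrichment of the supplied tests, might be pictured roughly as follows. This is a loose Python sketch and should not be read as AlphaCodium's code: ProblemContext, stub_reflect and stub_generate_tests are hypothetical names, and both stubs return canned values where a real flow would prompt a model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Test = Tuple[tuple, object]  # (arguments, expected result)


@dataclass
class ProblemContext:
    description: str
    supplied_tests: List[Test]
    reflection: str = ""
    generated_tests: List[Test] = field(default_factory=list)

    def all_tests(self) -> List[Test]:
        # Supplied tests first; skip generated tests that duplicate them.
        merged = list(self.supplied_tests)
        for test in self.generated_tests:
            if test not in merged:
                merged.append(test)
        return merged


def stub_reflect(description: str, tests: List[Test]) -> str:
    # Hypothetical stand-in for an LLM self-reflection prompt.
    return (
        f"Goal: {description}. "
        f"The {len(tests)} supplied tests cover only positive inputs."
    )


def stub_generate_tests(description: str, reflection: str) -> List[Test]:
    # Hypothetical stand-in for an LLM proposing extra tests.
    return [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]


def preprocess(description: str, supplied_tests: List[Test]) -> ProblemContext:
    # (a) reason about the problem and its tests, then
    # (b) enrich the test set before any code is generated.
    ctx = ProblemContext(description, supplied_tests)
    ctx.reflection = stub_reflect(description, supplied_tests)
    ctx.generated_tests = stub_generate_tests(description, ctx.reflection)
    return ctx


ctx = preprocess("add two integers", [((1, 2), 3), ((5, 5), 10)])
print(ctx.reflection)
print(len(ctx.all_tests()))
```

Keeping the supplied tests separate from the generated ones is a deliberate choice in this sketch: AI-generated tests can themselves be wrong, so a flow may want to trust them less than the tests that came with the problem.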

If prompt engineering can be said to be a comparatively knee-jerk way (albeit a fairly well-thought-out and directed jerk) of pushing AI in one direction or another, then surely flow engineering is the natural evolution of those staccato jumps and nudges into a more mellifluous harmony. After flow engineering, we’ll no doubt be looking to orchestrated AI fabric engineering – please don’t look that up, it’s not real – yet.
