Moreover, the chatbot fails to alert users to its coding defects – even though it could do so.

This week The Register reported on research showing that ChatGPT not only produces mostly insecure code but also fails to alert users to its inadequacies. The striking thing is that it is perfectly capable of doing so. Four researchers at the Université du Québec in Canada analyzed code generated by OpenAI’s large language model from a security standpoint. They posed the question: “how secure is code generated by ChatGPT?”. The answer: not very.

The abstract provides further details. “We ask ChatGPT to generate several programs and evaluate the security of the resulting source code. We further investigate whether ChatGPT can be prodded to improve the security by appropriate prompts and discuss the ethical aspects of using AI to generate code”.

What they found was not reassuring. “Results suggest that ChatGPT is aware of potential vulnerabilities, but nonetheless often generates source code that is not robust to certain attacks”.

“Worrisome” results

The authors describe their results as worrisome. “We found that, in several cases, the code generated by ChatGPT fell well below minimal security standards applicable in most contexts. In fact, when prodded as to whether or not the produced code was secure, ChatGPT could recognize that it was not.”

The researchers assigned ChatGPT a series of programming tasks designed to highlight different classes of security vulnerability. Examples include memory corruption, denial of service, and flaws related to deserialization and improperly implemented cryptography.
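To make the deserialization case concrete, here is a minimal, hypothetical sketch (not taken from the study, written in Python purely for illustration) of the kind of insecure pattern the researchers probed, alongside a safer alternative.

import json
import pickle

# Hypothetical example, not from the paper: loading attacker-controlled
# bytes with pickle is unsafe, because a crafted payload can execute
# arbitrary code the moment it is deserialized.
def load_profile_insecure(blob: bytes):
    return pickle.loads(blob)  # unsafe on untrusted input

# A safer pattern sticks to a data-only format such as JSON, which is
# parsed as plain data and never executed.
def load_profile_safer(blob: bytes):
    return json.loads(blob.decode("utf-8"))

The unsafe version is exactly the sort of code that works in a quick test yet fails basic security review, which is the gap the study highlights.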

ChatGPT generated just five secure programs out of 21 on its first attempt. Moreover, the researchers say, “ChatGPT seems aware of – and indeed readily admits – the presence of critical vulnerabilities in the code it suggests.” The problem is that it will only reveal those vulnerabilities if it is specifically asked to evaluate the security of the suggested code.

Raphaël Khoury, a professor of computer science and engineering at the university, told The Register: “Obviously, it’s an algorithm. It doesn’t know anything, but it can recognize insecure behaviour”.