OpenAI unveils Codex Security to detect vulnerabilities in AI code

OpenAI is introducing a security tool for software development called Codex Security. The application, which is currently available in a research preview, is designed to help development teams detect vulnerabilities in code faster and more accurately.

According to OpenAI, the tool focuses primarily on reducing false positives and better prioritizing actual security risks.

The tool uses OpenAI’s AI models and an agent-based approach to analyze security issues in the context of an entire codebase. Instead of just flagging individual vulnerabilities, the system attempts to understand how an application is structured and which parts of the system pose the greatest risk.

Following in Anthropic’s footsteps

With this tool, OpenAI aims to tackle a well-known problem in application security: the flood of low-impact reports that security teams must triage manually. Anthropic launched a similar tool, called Claude Code Security, at the end of February.

Developers can use Codex Security by giving the tool access to a repository that needs to be scanned. According to OpenAI, the system makes a temporary copy of the code in an isolated container in which the analysis is performed. In some cases, this analysis can take several days, depending on the size of the codebase.

The system then builds a threat model specific to the project. That model consists of a comprehensive description in natural language of how an application works and where potential attack points lie. For example, the model can identify components where users can upload data or have other interactions with the system.

Such components often pose a potential risk because they process input from outside the system. Development teams can customize the threat model to add additional context or to prioritize certain parts of the application during the analysis.
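The article does not describe the actual format of these threat models, but the idea can be illustrated with a hypothetical, simplified entry: a natural-language summary of a component plus the entry points where it accepts outside input. Every name and field below is an assumption for illustration, not OpenAI's real schema.

```python
# Hypothetical shape for one threat-model entry of the kind described
# above. The real format used by Codex Security is not public; this is
# only a sketch of the information such a model would capture.
threat_model_entry = {
    "component": "file upload endpoint",
    "summary": "Accepts user-supplied files and stores them for later processing.",
    "entry_points": ["POST /api/upload"],  # where outside input enters the system
    "risk": "Processes untrusted input from outside the system.",
    "priority": "high",  # teams can adjust priorities, per the article
}
```

A structured form like this would let teams add context or raise the priority of specific components, as the article says they can.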

Threat model guides vulnerability scans

Codex Security uses this threat model to search for potential vulnerabilities. Any issues found are tested in a sandboxed environment to determine whether they can actually be exploited. Based on these tests, the system filters out false positives and ranks the remaining vulnerabilities by severity.

The system also keeps logs of findings that fail the sandbox test. Developers can use these logs later to investigate issues that may have been incorrectly dismissed as false positives.
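The triage flow described in the last two paragraphs can be sketched as a simple pipeline: candidate findings are verified by a sandbox step, failures are kept in a log for later review, and confirmed issues are ranked by severity. This is a minimal illustration under assumed names; none of these types or functions come from OpenAI's actual API, and the real sandbox check attempts an exploit rather than reading a flag.

```python
# Sketch of the described triage flow: verify candidates in a sandbox
# step, log the failures, and rank confirmed issues by severity.
from dataclasses import dataclass

# Lower number = more severe, so sorting ascending puts critical first.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

@dataclass
class Finding:
    title: str
    severity: str
    exploitable: bool  # stand-in for the real sandbox verdict

def sandbox_check(finding: Finding) -> bool:
    # Placeholder: the real system would attempt the exploit in isolation.
    return finding.exploitable

def triage(candidates: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    confirmed, rejected_log = [], []
    for finding in candidates:
        (confirmed if sandbox_check(finding) else rejected_log).append(finding)
    # Rank the confirmed issues by severity; keep the rest for later review.
    confirmed.sort(key=lambda f: SEVERITY_ORDER[f.severity])
    return confirmed, rejected_log
```

The key point the article makes is that filtering happens before ranking: only findings that survive the sandbox step compete for a developer's attention, while the rejects remain inspectable.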

When a vulnerability is confirmed, Codex Security generates a proposed solution. That proposal includes both the code needed to fix the problem and an explanation in natural language.

According to OpenAI, the technology has already uncovered a number of concrete security issues during the beta phase, including a server-side request forgery vulnerability and a critical cross-tenant authentication flaw. During internal testing, these issues were quickly resolved after they were discovered.
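Server-side request forgery of the kind mentioned above typically arises when a server fetches a URL supplied by a user, allowing an attacker to point it at internal services such as a cloud metadata endpoint. The details of OpenAI's finding are not public; the sketch below only shows the general pattern and a common mitigation, host allowlisting, with a hypothetical allowlist.

```python
# Illustrative SSRF mitigation (not OpenAI's actual finding or fix):
# before fetching a user-supplied URL, check its scheme and hostname
# against an allowlist of hosts the server is permitted to contact.
from urllib.parse import urlparse

ALLOWED_HOSTS = {"images.example.com"}  # hypothetical allowlist

def is_safe_url(url: str) -> bool:
    parsed = urlparse(url)
    # Reject non-HTTP schemes and any host outside the allowlist,
    # e.g. internal addresses like http://169.254.169.254/.
    return parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS
```

A fix proposal of the kind the article describes would pair a check like this with an explanation of why the unvalidated fetch was exploitable.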

Development began as an internal security tool

The technology originally started as an internal tool called Aardvark, which OpenAI used to analyze its own code. This was followed by a limited beta with external customers. During that testing phase, the company managed to reduce the number of false positives by more than half, according to its own figures.

In the past month, Codex Security analyzed more than 1.2 million commits in external repositories. This identified hundreds of critical vulnerabilities and more than ten thousand high-severity issues. According to OpenAI, critical issues occurred in less than 0.1 percent of the scanned commits.

SiliconANGLE adds that the tool also found vulnerabilities that were serious enough to be included in the CVE database. A total of fourteen discovered issues have been assigned CVE numbers.

To further support the open source ecosystem, OpenAI is also launching a program that gives maintainers access to ChatGPT Pro or Plus and to Codex Security. The intention is for developers to be able to use the tool as part of their normal code review process.

Codex Security is currently being rolled out as a research preview for ChatGPT Enterprise, Business, and Edu customers via the Codex web interface. OpenAI says it is collecting user feedback during this phase to further improve the technology.