A new attack technique centered on the rapidly growing OpenClaw ecosystem exposes flaws in existing security measures.
Analyses show that attackers can exploit the way these AI-driven agents operate without traditional solutions such as EDR, DLP, or IAM flagging anything suspicious. This shifts the threat model from recognizable malware to seemingly legitimate actions that have been manipulated in content.
The core of the problem lies in how OpenClaw agents process instructions. In one scenario, a malicious command is hidden within a seemingly innocent email. An agent processes the content but, unnoticed, executes an additional instruction, such as forwarding sensitive data. Because this occurs via standard API calls with valid permissions, it remains invisible to the security stack.
Moreover, the adoption of OpenClaw is happening faster than many organizations realize. Research shows that a significant proportion of employees use such tools without IT approval. At the same time, the number of publicly accessible installations is growing rapidly, and many extensions contain vulnerabilities, which further increases the risk, VentureBeat explains.
Design choices create structural vulnerabilities
Security experts indicate that the platform was not designed from the ground up with strong security. Improvements are underway, but fundamental problems persist. Three vulnerabilities in particular stand out.
The first is semantic data theft. In this case, it is not the code but the meaning of instructions that is manipulated. The agent acts technically correctly but actually performs a malicious action. Because security systems primarily look at behavioral patterns rather than intent, this remains under the radar.
A second problem arises when multiple agents or extensions share the same context. A manipulated instruction can spread through a chain and take effect later. Research shows that such instructions can embed themselves in working files and only become active during a specific task, which makes detection difficult.
The third risk lies in the mutual trust between agents. When tasks are delegated, robust identity verification is often lacking. If a single agent is compromised, it can impersonate a trusted party and thereby gain further access.
New tools must mitigate risks
In recent weeks, various security solutions have been developed. Some tools focus on monitoring and integrity checks, others on sandboxing and stricter separation of privileges. There are also solutions that scan extensions and improve auditing capabilities.
Yet a significant gap remains. None of the current tools can fully analyze the intent behind an agent’s actions or effectively isolate context flows between agents. As a result, the most advanced attacks remain possible.
Within the security community, work is underway on a new standard for defining extension permissions. Each functionality must explicitly specify in advance which actions are permitted, similar to permissions in mobile apps. This should provide greater control and transparency, although significant architectural changes are needed to address the deeper issues.
For organizations, this means that existing security layers are insufficient. It is likely that OpenClaw or similar platforms are already in use, sometimes outside the purview of IT. Insight into usage and additional measures such as isolation and stricter control over agent actions are necessary.
The rise of agent-based AI thus marks a new chapter in cybersecurity. Whereas security used to focus on detecting malicious code, the focus is shifting toward understanding behavior and intent. It is precisely in this area that existing solutions appear to be insufficiently prepared.