Agentjacking: Prevent AI Code Execution via Poisoned Logs

A developer at a Fortune 100 technology company opened their AI coding assistant one morning. Within minutes, an attacker was executing code on their machine—no credential theft, no malware download, no infrastructure compromise required.

This is agentjacking, and it works because AI coding agents trust data they shouldn't.

What Happened

Researchers at Tenet Security discovered that attackers can exploit public Sentry Data Source Names (DSNs) to inject malicious commands into AI coding agents like Claude Code, Cursor, and Codex. The attack chain is straightforward:

Attacker identifies a target organization's public Sentry DSN (used for error monitoring).
Attacker submits crafted error messages containing malicious instructions to that DSN.
When a developer's AI agent queries Sentry through the Model Context Protocol (MCP), it retrieves the poisoned error logs.
The AI agent interprets the malicious instructions as legitimate context and executes them.

Tenet confirmed execution on a machine inside a $250 billion Fortune 100 technology company. Their passive reconnaissance found 2,388 organizations with injectable DSNs. During controlled validation, they achieved an 85% success rate across multiple waves.

Timeline

Discovery Phase: Tenet Security identified that public DSNs—intentionally designed as non-sensitive—become attack vectors when AI agents consume their data.

Validation Phase: Controlled testing against organizations that had consented to security research. Tenet achieved 85% successful code execution across validation waves.

Disclosure: Tenet reported findings to affected platform providers and model vendors. The vulnerability affects any AI agent using MCP to integrate with external services where attackers can inject data.

Current State: As of disclosure, no comprehensive fix exists at the protocol level. Individual organizations must implement their own controls.

Which Controls Failed or Were Missing

Input Validation: AI agents consumed external data without treating it as untrusted input. The agents had no mechanism to distinguish between legitimate error logs and attacker-controlled content.

Least Privilege: Agents operated with the developer's full system permissions. When the agent executed commands, it did so with unrestricted access to the developer's files, credentials, and network.

Monitoring and Detection: Traditional security controls failed completely:

EDR tools saw authorized user actions, not malicious activity.
WAFs and network monitoring never triggered—the attack used legitimate API calls.
SIEM systems had no baseline for "normal" AI agent behavior.

Separation of Duties: No approval workflow existed between the AI agent receiving instructions and executing system commands. The agent acted autonomously on external data.

Secure Integration: The Model Context Protocol lacks built-in mechanisms to verify data provenance or integrity. Agents cannot determine whether data from an MCP server has been tampered with.

What the Relevant Standards Require

PCI DSS v4.0.1 Requirement 6.4.3 mandates that scripts executing in the browser are managed to prevent tampering. While this requirement targets payment page scripts specifically, the principle extends to any code execution based on external input: you must verify integrity before execution.

OWASP ASVS v4.0.3 Section 5.1.3 requires that applications verify the integrity of data from external sources before use. AI agents retrieving data through MCP violate this requirement by treating external service responses as inherently trustworthy.

NIST 800-53 Rev 5 Control SI-3 addresses malicious code protection. The control requires mechanisms to detect and eradicate malicious code introduced through various vectors. AI agents currently have no equivalent protection for malicious instructions embedded in data.

ISO/IEC 27001:2022 Control 8.24 (Web filtering) and Control 8.7 (protection against malware) both assume traditional threat vectors. Neither framework yet addresses the risk of authorized tools executing attacker-controlled instructions disguised as legitimate data.

The gap is clear: existing standards assume you can distinguish between trusted and untrusted input sources. AI agents blur this boundary by design.

Lessons and Action Items for Your Team

Immediate Actions:

Audit which AI coding tools your developers use and document their external integrations. You cannot secure what you haven't inventoried. Check for MCP configurations connecting to services where external parties can inject data—error monitoring, logging platforms, ticketing systems.

Implement network segmentation for developer machines running AI agents. If an agent executes malicious code, limit its blast radius. Restrict access to production credentials, internal repositories, and sensitive file shares from these endpoints.

Disable or restrict MCP integrations until you can implement monitoring. The convenience of AI agents accessing your error logs is not worth the risk of code execution.

Runtime Controls:

Deploy application allowlisting on developer machines. Tools like Windows Defender Application Control or AppLocker can restrict which executables run, even when launched by an AI agent. This won't stop all attacks, but it raises the bar.

Implement command execution logging that captures the parent process. You need visibility into which commands your AI agents are running. Standard audit logs often miss this context—you need to know that git clone was initiated by your AI agent, not the developer directly.

Process Changes:

Require manual approval for any AI agent action that modifies files, executes commands, or accesses credentials. Yes, this reduces the agent's autonomy. That's the point. An approval step breaks the automated attack chain.

Treat AI agent prompts and responses as untrusted input in your threat model. Apply the same scrutiny you would to user-submitted data in a web application. If you wouldn't execute a shell command based on a POST parameter, don't let your AI agent do it based on an MCP response.

Vendor Accountability:

Ask your AI agent vendors: What controls prevent execution of instructions from compromised external services? How does the agent verify data integrity from MCP servers? What logging exists for agent-initiated actions?

If they cannot answer these questions, you're running unaudited code execution as a service.

Push for MCP protocol extensions that support data signing and provenance verification. The protocol needs cryptographic proof that data came from the claimed source and hasn't been modified.

Detection Strategy:

Build behavioral baselines for your AI agents. What files do they typically access? Which commands do they run? Deviations from these patterns warrant investigation, even if the actions appear authorized.

The hard truth: agentjacking works because it exploits trust, not vulnerabilities. Your AI agents will execute attacker instructions as long as those instructions arrive through channels the agent considers legitimate. Until the protocol layer provides integrity guarantees, your controls must assume external data is hostile.

Sentry DSN Model Context Protocol

Agentjacking: Code Execution via Poisoned Error Logs

What Happened

Timeline

Which Controls Failed or Were Missing

What the Relevant Standards Require

Lessons and Action Items for Your Team

You Might Also Like

npm File Overwrite Flaw: CVE-2019-16775

AI Agent Breached Through SAML Flaw in Legacy IdP

Strict HTTP Parsing Broke Your API: The Node.js CVE-2019-15606 Incident