Skip to main content
AI Agent Runs Malicious Code from Error LogsIncident
5 min readFor Security Engineers

AI Agent Runs Malicious Code from Error Logs

What Happened

Tenet Security discovered a new attack pattern called "Agentjacking" that weaponizes AI coding agents against their own development teams. The attack works by injecting malicious code into Sentry error reports, which AI agents then retrieve and execute when developers ask them to investigate application errors.

The attack exploits the Model Context Protocol (MCP), which allows AI coding agents to query external data sources like Sentry. When a developer asks their AI assistant to "check what's causing these errors," the agent pulls error data from Sentry and processes it as trusted input. If an attacker has injected malicious code into those error messages, the agent executes it on the developer's machine.

Tenet Security tested this against real-world targets and achieved an 85% exploitation success rate. They identified at least 2,388 organizations with exposed Sentry Data Source Names (DSNs) that could be exploited through this technique.

Timeline

Discovery Phase
Tenet Security identified the vulnerability in how AI agents process external data through MCP integrations. They developed proof-of-concept code that could inject malicious payloads into Sentry error reports.

Testing Phase
Researchers tested the attack against exposed Sentry DSNs and documented the 85% success rate. They found thousands of organizations with publicly accessible error reporting endpoints.

Disclosure
Tenet Security reported the vulnerability to Sentry. The company responded by implementing a global content filter targeting the specific payload string used in the proof-of-concept.

Current State
Sentry's filter blocks the known exploit signature, but the underlying architectural issue remains: AI agents treat external data sources as trusted without verification.

Which Controls Failed or Were Missing

Input Validation
The AI agents failed to sanitize or validate data retrieved from Sentry before processing it. This violates secure coding principles—the agent treated error logs as trusted input rather than potentially hostile data.

Least Privilege
The agents ran with the same permissions as the developer's user account. When the malicious code executed, it inherited full access to the developer's filesystem, credentials, and network access. No sandboxing or permission boundaries existed between the agent's operations and the host system.

Data Classification
Organizations failed to treat Sentry DSNs as sensitive credentials. These endpoints were discoverable through public repositories and web searches, allowing attackers to inject malicious errors without compromising any infrastructure.

Security Monitoring
Traditional endpoint detection and response (EDR) tools missed the attack entirely. The malicious code execution appeared as legitimate AI agent activity—reading error logs and performing file operations. No anomalous network connections or process trees triggered alerts.

Trust Boundaries
The architecture assumed that data from "official" integrations like Sentry could be trusted. No verification mechanism existed to confirm that error reports came from legitimate sources or hadn't been tampered with.

What the Relevant Standards Require

OWASP ASVS v4.0.3 — Requirement 5.1.1
"Verify that the application has defenses against HTTP parameter pollution attacks, particularly if the application framework makes no distinction about the source of request parameters."

While this requirement targets HTTP parameters specifically, the principle applies: your application—including AI agents acting on your behalf—must treat all external input as untrusted. Error logs from Sentry are external input.

OWASP ASVS v4.0.3 — Requirement 5.2.1
"Verify that all untrusted HTML input from WYSIWYG editors or similar is properly sanitized with an HTML sanitizer library or framework feature."

Extend this thinking to AI agent inputs. Data retrieved through MCP or similar protocols needs the same scrutiny you'd apply to user-submitted HTML. The agent must sanitize before processing.

NIST 800-53 Rev 5 — Control AC-6 (Least Privilege)
"Employ the principle of least privilege, allowing only authorized accesses for users (or processes acting on behalf of users) which are necessary to accomplish assigned tasks."

Your AI coding agent doesn't need full filesystem access to review error logs. It should run in a restricted context with explicit permissions for only the operations required. If it needs to write code, that write operation should require explicit approval.

ISO/IEC 27001:2022 — Control 8.24 (Use of Cryptography)
While not directly about cryptography, this control family addresses data integrity. Organizations must ensure that data retrieved from external sources hasn't been tampered with. For AI agents, this means verifying the authenticity and integrity of data before processing it.

Lessons and Action Items for Your Team

Audit Your AI Agent Integrations
List every external data source your AI coding agents can access. For each integration, document what data the agent retrieves and whether that data undergoes validation before processing. Sentry isn't the only risk—any MCP integration or API the agent queries represents a potential injection point.

Implement Agent Sandboxing
Run AI agents in containers or virtual environments with restricted permissions. The agent should not have write access to your source code repositories, credential stores, or production systems without explicit approval gates. Tools like Docker or Firecracker can create these boundaries.

Treat Agent Inputs as Hostile
Apply the same input validation rules to AI agent data sources that you apply to user inputs. If you wouldn't trust a user to submit arbitrary code through a web form, don't trust Sentry error logs or GitHub issue comments to contain safe data. Implement allowlists for expected data patterns and reject anything that doesn't match.

Rotate and Protect Integration Credentials
Your Sentry DSNs, API keys, and other integration credentials should be treated as secrets. Don't commit them to public repositories. Rotate them regularly. Consider whether they need to be accessible at all—if your AI agent only needs read access to error logs, create a dedicated read-only credential rather than using your full-access DSN.

Add Execution Approval Gates
Configure your AI agents to require human approval before executing code or making system changes. The agent can suggest fixes, but a developer should review and approve before execution. This breaks the automated attack chain.

Monitor Agent Behavior
Traditional EDR won't catch this, but you can build custom detection. Log all AI agent operations: what data sources it queries, what code it executes, what files it modifies. Alert on anomalies like the agent accessing unusual API endpoints or executing code patterns it hasn't used before.

Test Your Defenses
Sentry implemented a filter for the known payload string, but signature-based defenses fail against variations. Test whether your agents properly sanitize inputs by crafting your own malicious error messages in a development environment. Can you inject code that the agent executes? If so, your validation isn't working.

The Agentjacking attack demonstrates that AI coding agents inherit all the security requirements of any other software component in your environment—plus new risks from their ability to autonomously query external data and execute code. Treat them accordingly.

Topics:Incident

You Might Also Like