
"We just added MCP to our agent—how worried should we be?"

Context: Questions from teams deploying AI agents

Security channels have been inundated with questions since OX Security disclosed vulnerabilities in Model Context Protocol (MCP) implementations. More than 30 remote code execution issues were reported, 10 of which received CVE IDs. The affected code spans six official services and thousands of public servers across 200 GitHub projects.

The questions below come from teams that have deployed MCP-based agents or are considering it. They highlight a critical issue: developers assumed the reference implementation from Anthropic would be secure by default. It isn't.


Q1: "Is this just about one bad implementation, or is the protocol itself broken?"

The protocol specification isn't inherently flawed, but Anthropic's choice of the STDIO transport introduces systemic risk. With STDIO, MCP servers run as local processes and exchange data over standard input/output streams. That design requires the host to spawn a process from whatever command the server configuration specifies, and the reference implementation doesn't restrict those commands.

This isn't a single vendor's coding error; it's a design pattern replicated by thousands of developers from official examples. When a framework's "hello world" includes unsafe defaults, every downstream implementation inherits the vulnerability.

What this means for you: If you're using MCP servers that spawn via STDIO, assume they can execute arbitrary commands unless you've explicitly added sandboxing. The transport mechanism itself is the attack surface.
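
To make that concrete, here is a minimal sketch of what an MCP host effectively does with a STDIO server entry. The file name and the mcpServers/command/args layout are assumptions borrowed from common desktop-client configs, not a specific SDK API; the point is that whatever ends up in the command field becomes a spawned process running with the agent's privileges.

```python
import json
import subprocess

# Hypothetical config path; real hosts read their own locations
# (a desktop client's JSON settings, environment variables, etc.).
CONFIG_PATH = "mcp_config.json"

def launch_stdio_servers(config_path: str) -> list[subprocess.Popen]:
    """Mimic what an MCP host does: spawn one process per STDIO server entry."""
    with open(config_path) as f:
        config = json.load(f)

    procs = []
    for name, server in config.get("mcpServers", {}).items():
        cmd = [server["command"], *server.get("args", [])]
        # This is the entire "transport": whoever controls the config
        # controls the command line that gets executed.
        print(f"spawning {name}: {cmd}")
        procs.append(subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE))
    return procs

if __name__ == "__main__":
    launch_stdio_servers(CONFIG_PATH)
```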


Q2: "Our MCP servers are internal-only. Are we still at risk?"

Yes. Internal-only doesn't mean safe—it means your threat model shifts to compromised internal systems, supply chain attacks, and insider threats.

Consider how MCP servers are configured: through JSON files, environment variables, or API calls. If an attacker compromises a developer workstation or CI/CD pipeline, they can modify MCP server configurations to execute commands with the privileges of the account running your agent. This could be a service account with access to production databases, cloud credentials, or customer data.

Practical step: Map which systems can modify MCP configurations in your environment. Treat those systems as critical security boundaries. If your CI/CD can push new MCP server definitions, that pipeline needs the same controls as your production deployment process.
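
One way to start that mapping is to baseline the configuration files themselves. Below is a minimal sketch; the file locations are hypothetical placeholders, and the idea is simply that a changed MCP server definition should be an event you notice, not something you discover during incident response.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical locations; substitute wherever your agents actually read
# MCP server definitions from (developer workstations, CI images, etc.).
CONFIG_PATHS = [
    Path.home() / ".config" / "my-agent" / "mcp_config.json",
    Path("/etc/my-agent/mcp_servers.json"),
]

def baseline(paths, out_file="mcp_config_baseline.json"):
    """Record SHA-256 hashes of MCP config files for later comparison."""
    digests = {}
    for p in paths:
        if p.exists():
            digests[str(p)] = hashlib.sha256(p.read_bytes()).hexdigest()
    Path(out_file).write_text(json.dumps(digests, indent=2))
    return digests

def check(paths, baseline_file="mcp_config_baseline.json"):
    """Return config files whose contents no longer match the baseline."""
    recorded = json.loads(Path(baseline_file).read_text())
    changed = []
    for p in paths:
        current = hashlib.sha256(p.read_bytes()).hexdigest() if p.exists() else None
        if recorded.get(str(p)) != current:
            changed.append(str(p))
    return changed

if __name__ == "__main__":
    baseline(CONFIG_PATHS)
    print("changed:", check(CONFIG_PATHS))
```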


Q3: "Should we wait for Anthropic to fix this, or do we need to act now?"

Act now. Framework providers may eventually improve defaults, but you own the risk in your deployment today.

The core issue is that security responsibility was implicitly delegated to application developers without clear guidance. Anthropic's documentation didn't warn that STDIO configurations allow command execution. Most teams assumed "official implementation" meant "production-ready security."

Immediate actions:

  • Audit every MCP server configuration in your environment for STDIO transports.
  • Implement process-level sandboxing (containers, seccomp profiles, or restricted user accounts); a minimal sketch follows this list.
  • Add integrity checks for MCP configuration files—treat them like SSH authorized_keys.
  • If you can't sandbox, switch to SSE (Server-Sent Events) transport, which doesn't spawn processes.
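
For the sandboxing item above, here is a minimal Linux-only sketch that drops an MCP server to an unprivileged account with basic resource limits before it starts. It assumes the parent process has enough privilege to change uid/gid; containers or seccomp profiles give stronger isolation, but even this reduces what a compromised server can reach.

```python
import os
import resource
import subprocess

def spawn_sandboxed(cmd, uid=65534, gid=65534, cpu_seconds=60, mem_bytes=256 * 1024 * 1024):
    """Spawn an MCP server as an unprivileged user with basic resource limits.

    Linux-only sketch; the parent must have enough privilege to drop to the
    target uid/gid (65534 is the conventional 'nobody' account).
    """
    def drop_privileges():
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        os.setgid(gid)   # drop group first, then user
        os.setuid(uid)

    return subprocess.Popen(
        cmd,
        preexec_fn=drop_privileges,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env={"PATH": "/usr/bin:/bin"},  # minimal environment, no inherited secrets
    )

if __name__ == "__main__":
    # Hypothetical server command; use whatever your configuration specifies.
    proc = spawn_sandboxed(["python3", "-m", "some_mcp_server"])
    print("sandboxed pid:", proc.pid)
```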

Q4: "We're about to deploy our first AI agent. Should we avoid MCP entirely?"

Not necessarily, but treat it the way you treat other technologies with a history of remote code execution issues: databases, message queues, SSH. It requires deliberate security architecture.

The lesson here applies beyond MCP: AI framework security is immature. Providers are optimizing for developer speed and capability, not security defaults. This pattern will repeat with other agent frameworks, tool-calling mechanisms, and context injection systems.

Design principle: Assume any component that spawns processes, evaluates code, or interprets commands is exploitable unless proven otherwise. Your architecture should contain the blast radius when (not if) a component is compromised.

For MCP specifically:

  • Use SSE transport when possible.
  • Run MCP servers in dedicated containers with minimal privileges.
  • Monitor process execution from MCP-spawned services; see the monitoring sketch after this list.
  • Implement network segmentation so compromised agents can't pivot.
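
For the monitoring bullet, here is a minimal sketch using the third-party psutil package (an assumption, not part of any MCP SDK): watch the children of an MCP server process and flag anything outside a small allow-list, since a tool server that suddenly spawns shells or network utilities deserves an alert.

```python
import time
import psutil  # third-party; pip install psutil

# Hypothetical allow-list: what this MCP server is expected to run.
EXPECTED = {"python3", "node"}

def watch_children(pid: int, interval: float = 5.0):
    """Alert when an MCP server process spawns something outside the allow-list."""
    server = psutil.Process(pid)
    while server.is_running():
        for child in server.children(recursive=True):
            try:
                name = child.name()
            except psutil.NoSuchProcess:
                continue
            if name not in EXPECTED:
                print(f"ALERT: unexpected child process {name} (pid {child.pid}) "
                      f"spawned by MCP server {pid}")
        time.sleep(interval)

if __name__ == "__main__":
    watch_children(pid=12345)  # hypothetical MCP server pid
```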

Q5: "How do we explain this risk to leadership? They just approved budget for AI agents."

Frame it as a supply chain security issue, not an AI problem. Leadership understands supply chain risk after recent high-profile incidents.

The narrative: "We're integrating a protocol where the reference implementation allows remote code execution by default. This affects thousands of open-source projects that copied the pattern. It's similar to discovering that a widely-used authentication library had a backdoor—except in this case, it's not malicious, just unsafe design."

Risk quantification approach:

  • Identify what data your AI agents access (customer records, credentials, internal systems).
  • Map the privilege level of accounts running MCP servers.
  • Calculate potential impact if those accounts were compromised.
  • Compare mitigation cost (sandboxing, monitoring) against potential breach cost; a back-of-the-envelope sketch follows this list.
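
A back-of-the-envelope version of that comparison is sketched below. Every number is a placeholder to show the structure of the argument, not an estimate for any real environment.

```python
# Back-of-the-envelope comparison; all figures are illustrative placeholders.
breach_probability = 0.05   # assumed annual likelihood of MCP-related compromise
breach_cost = 2_000_000     # assumed notification, response, and customer impact (USD)
mitigation_cost = 60_000    # assumed containerization, monitoring, and audit effort (USD)

expected_annual_loss = breach_probability * breach_cost
print(f"expected annual loss without mitigation: ${expected_annual_loss:,.0f}")
print(f"mitigation cost: ${mitigation_cost:,.0f}")
print("mitigate" if mitigation_cost < expected_annual_loss else "re-evaluate assumptions")
```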

This comparison usually favors a modest investment in containerization and monitoring over accepting a risk that could trigger breach notification requirements.


Q6: "What should we demand from AI framework providers going forward?"

Demand that security is a primary design constraint, not an afterthought in documentation.

Specifically:

  • Secure defaults: Frameworks should fail closed. If STDIO must exist, it should require explicit opt-in with clear warnings (see the sketch below).
  • Threat models: Every protocol should document its attack surface and trust boundaries.
  • Security documentation: Not buried in GitHub issues—prominent in getting-started guides.
  • Sandboxing examples: Reference implementations should demonstrate isolation, not just functionality.

This isn't unreasonable. Database drivers don't default to allowing SQL injection. Web frameworks enable CSRF protection by default. AI frameworks need to adopt the same maturity.
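
As an illustration of what fail-closed could look like in practice, here is a hypothetical configuration loader; nothing like this ships in the MCP SDKs today, which is exactly the point.

```python
class InsecureTransportError(Exception):
    """Raised when a config requests STDIO without an explicit opt-in."""

def load_server(entry: dict, allow_stdio: bool = False) -> dict:
    """Hypothetical fail-closed loader: STDIO requires an explicit opt-in flag."""
    transport = entry.get("transport", "stdio")
    if transport == "stdio" and not allow_stdio:
        raise InsecureTransportError(
            "STDIO transport executes a local command with this process's "
            "privileges; pass allow_stdio=True only after sandboxing it."
        )
    return entry

if __name__ == "__main__":
    server = {"transport": "stdio", "command": "python3", "args": ["-m", "some_mcp_server"]}
    try:
        load_server(server)                 # fails closed by default
    except InsecureTransportError as e:
        print("refused:", e)
    load_server(server, allow_stdio=True)   # explicit, visible opt-in
```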

Practical leverage: When evaluating AI tools, ask vendors about their secure-by-default philosophy during procurement. Make it a selection criterion. Vendors respond to customer requirements.


Where to go for more

The CVE IDs issued for these vulnerabilities provide specific technical details if you need to assess impact on particular dependencies. Search the CVE database for MCP-related issues disclosed in the past quarter.

For ongoing monitoring, track the MCP specification repository for security-related discussions. The community is actively debating transport security, but implementation changes lag specification updates.

Your immediate action item: inventory your MCP deployments this week. The configuration audit takes hours; recovering from a breach takes months.
