Skip to main content
Prompt Injection Flaws in Salesforce Agentforce and Microsoft Copilot: A Security TeardownIncident
4 min readFor Compliance Teams

Prompt Injection Flaws in Salesforce Agentforce and Microsoft Copilot: A Security Teardown

Two recently fixed prompt injection vulnerabilities in Salesforce Agentforce and Microsoft Copilot could have allowed external attackers to leak sensitive data from enterprise systems. Here's what happened, which controls failed, and what your team needs to do differently.

What Happened

Security researchers discovered prompt injection vulnerabilities in two major AI agent platforms: Salesforce Agentforce and Microsoft Copilot. Both vendors have since patched these flaws. The vulnerabilities enabled attackers to manipulate the AI agents' behavior through crafted inputs, potentially exfiltrating sensitive data like customer records, internal documents, and authentication tokens.

Prompt injection works by embedding malicious instructions within user input that the AI model interprets as commands. Unlike traditional injection attacks that exploit parsing weaknesses, prompt injections exploit the AI model's inability to distinguish between trusted system prompts and untrusted user content.

Timeline

The exact discovery and disclosure timeline for these vulnerabilities has not been publicly detailed. However, both Microsoft and Salesforce released patches after security researchers reported them through responsible disclosure channels.

Which Controls Failed or Were Missing

Input Validation at the AI Boundary

The primary failure was inadequate separation between user-controlled input and system instructions. Your AI agent needs to treat all external input as untrusted data. These systems lacked controls to prevent user input from being interpreted as commands to the AI model itself.

Least Privilege for AI Agents

AI agents in both platforms had access to necessary data, but lacked granular controls to limit what an attacker could extract if they compromised the agent's decision-making process. When you give an AI agent access to your CRM, email, or document repository, you create a new exfiltration path that needs explicit controls.

Monitoring and Anomaly Detection

These vulnerabilities could have been mitigated through behavioral monitoring. If your AI agent starts accessing unusual data patterns, making unexpected API calls, or producing atypical outputs, that's a signal. The absence of runtime monitoring for AI agent behavior meant these attacks could proceed undetected.

What the Standards Require

OWASP Top 10 for LLM Applications

The OWASP Top 10 for LLM Applications lists prompt injection as LLM01—the number one risk. The guidance is explicit: implement privilege controls, validate and sanitize all inputs, and use indirect prompt injection defenses when processing external content.

ISO/IEC 27001:2022 Control 8.16 (Monitoring Activities)

ISO/IEC 27001:2022 Control 8.16 requires monitoring networks, systems, and applications for anomalous behavior. This applies to your AI agents. You need logging that captures what data the agent accessed, what actions it took, and what outputs it generated.

NIST CSF v2.0 Function: Detect (DE.CM-7)

The NIST CSF Detect function requires monitoring for unauthorized behavior. When an attacker uses prompt injection to alter your agent's behavior, that's unauthorized software behavior. Your detection controls need to extend to AI systems.

PCI DSS v4.0.1 Requirement 6.4.3

If your AI agent processes payment data, PCI DSS Requirement 6.4.3 applies: you must protect against injection attacks. Prompt injection is an injection attack. Your web application firewall or API gateway needs rules that can detect and block prompt injection attempts.

Lessons and Action Items for Your Team

Map Your AI Agent Attack Surface

Create an inventory of every AI agent or AI-powered feature in your environment. Document:

  • What data sources it can access
  • What actions it can take
  • What user input it processes
  • Whether it processes external content

This is your AI supply chain. You can't secure what you haven't identified.

Implement Prompt Injection Defenses

Add a validation layer between user input and your AI model. Use a secondary model or rule-based system to analyze user input for injection attempts before passing it to your primary AI agent. Assume external content is adversarial and implement content filtering.

Apply Least Privilege to AI Agents

Your AI agent doesn't need access to your entire database. Implement role-based access control at the data layer. Use API gateways to enforce access policies independent of the AI model's behavior. Create separate service accounts for each AI agent with minimum necessary permissions.

Build AI-Specific Monitoring

Extend your SIEM to capture AI agent activity. Log every prompt, data access, and action taken. Create baselines for normal behavior and alert on anomalies.

Test for Prompt Injection in Your Security Program

Add prompt injection testing to your application security testing program. Include prompt injection scenarios in your penetration testing scope. Your red team should attempt to manipulate AI agent behavior just as they would attempt SQL injection or XSS.

The Salesforce and Microsoft incidents demonstrate that prompt injection is a real vulnerability affecting production enterprise systems. Your security program needs to evolve to address it.

Topics:Incident

You Might Also Like