Skip to main content
Microsoft Drops Two Safety Tools for AI AgentsGeneral
4 min readFor Security Engineers

Microsoft Drops Two Safety Tools for AI Agents

Microsoft has released Rampart and Clarity as open-source projects, offering new ways for organizations to enhance AI agent security. These tools address a critical issue: AI agents now execute code, access databases, and modify production systems, yet many teams still treat AI safety as a periodic task rather than a continuous engineering discipline.

Here's what has changed and what your security team needs to do about it.

Integrating AI Safety into CI/CD Pipelines

Rampart and Clarity bring AI safety checks into your CI/CD pipeline. Rampart runs automated red team scenarios against your AI agents during builds, while Clarity validates design assumptions before any code is written. Both tools are built on PyRIT, Microsoft's open automation framework for red teaming generative AI systems.

AI agents are no longer just chatbots—they perform privileged operations. Traditional application security workflows don't account for prompt injection leading to database access or an agent escalating its permissions due to a broad interpretation of a user request.

Addressing Key Security Concerns

1. Prompt Injection as Privilege Escalation

When an AI agent can execute system commands, a successful prompt injection isn't just a policy violation—it's a potential security breach. Rampart continuously tests for this by simulating adversarial prompts against your agent's permissions model. It checks if refusals to inappropriate requests hold up when the agent has database write access.

2. Design Assumptions Under Adversarial Conditions

Clarity focuses on the architecture phase. It requires you to document assumptions, such as an agent only querying read-only endpoints, and generates test cases that challenge these assumptions. This helps identify security gaps before your agent reaches staging.

3. Safety Checks vs. Deployment Speed

Many teams conduct security reviews sporadically, not with every commit. Rampart integrates with CI/CD tools like GitHub Actions, Azure DevOps, and Jenkins, ensuring every merge request includes automated safety validation. If your agent gains access to a new API, Rampart tests for new attack vectors.

4. Open-Source Access for AI Red Teaming

Previously, continuous AI safety testing required custom frameworks or expensive tools. Microsoft's open-source release allows both startups and large enterprises to implement automated safety checks, lowering the barrier from "hire a specialized AI security team" to "add a stage to your pipeline."

Steps for Your Security Team

Your current security practices likely treat AI agents like any other application component, assuming your code does what you programmed it to do. AI agents, however, interpret instructions and make decisions based on context, which can lead to unexpected actions.

This changes your threat model. You're defending against:

  • Agents misinterpreting requests as instructions to bypass controls
  • Prompt injection attacks exploiting operational privileges
  • Agents learning from adversarial input and spreading malicious behavior
  • Permission escalation through natural language manipulation

Your team needs to validate agent behavior continuously. Rampart and Clarity provide the automation needed to make this feasible.

Action Items by Priority

Priority 1: Map AI Agent Permissions

Before integrating safety tools, document what your agents can do. Identify which databases they can query, APIs they can call, and file systems they can access. Rampart's effectiveness relies on testing against your actual permission model.

Create a matrix: agent name, granted permissions, justification, and last review date. If you can't complete this, you're not ready for continuous safety checks.

Priority 2: Integrate Clarity into Design Reviews

Add Clarity to your architecture approval process. Require a Clarity validation report for new or expanded AI agent capabilities. This identifies assumptions like "the agent will only use approved data sources" before they become vulnerabilities.

Start with high-privilege agents—those with database write access, API keys, or system-level permissions.

Priority 3: Add Rampart to Your CI/CD Pipeline

Select one AI agent and integrate Rampart into its deployment pipeline as a proof of concept. Run it on every pull request, not just before releases. Set failure thresholds: if Rampart detects privilege escalation or data exfiltration, the build fails.

Document findings during the first two weeks to uncover edge cases missed by manual testing.

Priority 4: Develop Runbooks for Safety Test Failures

Rampart will identify issues. Your team needs procedures for triaging them. Not every failed check is a critical vulnerability—some are accepted design limitations. Create a classification system: block deployment, fix before next sprint, or document as a known limitation.

Include escalation paths. If Rampart detects a new attack pattern, determine who investigates and who decides whether to proceed.

Priority 5: Incrementally Extend Coverage

After validating the workflow with one agent, expand to your next three highest-risk agents. Avoid instrumenting everything at once. Build expertise, refine test scenarios, and let your team adapt to the new workflow before making it universal.

AI security best practices

Topics:General

You Might Also Like