ASAPP’s Red Team Finds 50+ AI Flaws Daily with Continuous Testing

Introduction: The Need for Continuous Adversarial Testing

ASAPP, an enterprise AI provider, identified a significant gap in their security strategy: traditional pre-deployment testing couldn't keep up with the rapid evolution of their AI models. To address this, they developed Continuous Red Teaming—an automated adversarial testing system using the Promptfoo platform. This system tests for over 50 vulnerability types in real-time, aligning findings with the OWASP Top 10 for LLMs and the NIST AI Risk Management Framework. This approach offers a proactive solution to the limitations of static security testing for AI systems.

Transition to Continuous Testing

Before Implementation:

ASAPP conducted point-in-time security assessments during model development.
Models were deployed with security snapshots that quickly became outdated.
There was no systematic method to identify new vulnerabilities introduced by model updates or evolving adversarial prompts.

After Implementation:

Continuous testing is now conducted on production models.
Real-time vulnerability data is integrated into security dashboards.
Automated grading reduces the need for manual reviews.

The shift to continuous testing was driven by ASAPP's transition from passive AI systems (answering questions) to active ones (taking actions). When AI can modify records or trigger workflows, the impact of a successful prompt injection increases significantly.

Identifying Gaps in Traditional Controls

Lack of Continuous Validation: Testing AI models only during development leaves significant time gaps between security checks. Unlike traditional applications, AI systems can be manipulated through natural language, necessitating ongoing validation.

Inadequate Vulnerability Coverage: Traditional security tests, such as those for SQL injection, are insufficient for AI systems. New attack vectors like prompt injection, data poisoning, and model inversion require specialized checks.

Misalignment with AI-Specific Frameworks: Generic security standards do not address AI-specific risks. Testing against OWASP's web application list won't detect indirect prompt injections that could lead to data exfiltration.

Human Bias in Security Assessment: Manual red teaming can be inconsistent. Automated grading ensures repeatable and reliable measurements across test runs.

Aligning with Relevant Standards

OWASP Top 10 for LLMs highlights critical security risks for large language model applications, such as Prompt Injection and Training Data Poisoning. While ASAPP's testing aligns with these categories, the standard does not mandate continuous testing. Your team must establish policies to fill this gap.

NIST AI Risk Management Framework emphasizes ongoing risk assessment throughout the AI system lifecycle. Section 4.3 specifically calls for "continuous monitoring and periodic assessment."

ISO/IEC 27001:2022 Control 8.8 requires managing technical vulnerabilities. For AI systems, this involves tracking new attack vectors and testing model vulnerabilities. Annual penetration tests are insufficient.

Actionable Steps for Your Team

1. Map Your AI Attack Surface

Identify every AI system interacting with production data or performing actions. Document:

Input sources (user prompts, uploaded documents, API calls)
Actions the AI can take (database queries, API calls, email sends)
Data it can access (customer records, internal documents, credentials)

Prioritize high-risk systems that can modify data or trigger financial transactions.

2. Implement Automated Adversarial Testing

Use Promptfoo to test your models. Configure cases for:

Prompt injection attempts
Data exfiltration through output manipulation
Jailbreaking attempts
Model inversion

Integrate these tests into your CI/CD pipeline and run them against production models. ASAPP suggests daily testing as a baseline.

3. Grade Results Against Frameworks

Map test results to OWASP Top 10 for LLMs categories. Track trends and prioritize fixes using NIST AI RMF's risk rating approach.

4. Integrate Security into Model Updates

Every model update should trigger:

A full adversarial test suite
A comparison against the previous version's security posture
A go/no-go decision based on test results

Continuous testing allows you to address vulnerabilities before deployment.

5. Document Your Testing Methodology

Prepare for SOC 2 audits by documenting:

Testing frequency and triggers
Vulnerability types covered
Pass/fail criteria
Remediation workflow

ASAPP's alignment with OWASP and NIST frameworks provides a solid audit trail. Your approach should do the same.

By adopting continuous adversarial testing, your team can proactively secure AI systems as they transition from passive to active roles. Follow ASAPP's lead to stay ahead of potential breaches.

ASAPP's Red Team Finds 50+ Flaws Daily

Introduction: The Need for Continuous Adversarial Testing

Transition to Continuous Testing

Identifying Gaps in Traditional Controls

Aligning with Relevant Standards

Actionable Steps for Your Team

You Might Also Like

Gitea Header Injection: 13 Days to Exploitation

ColdFusion Exploit Caught in Two Hours

The Log4Shell Fix That Never Arrived