Elementary-Data Supply Chain Attack: Stealing Cloud Credentials Fast

What Happened

On April 24, 2025, attackers published version 0.23.3 of the Python package elementary-data to PyPI. This malicious release contained code designed to exfiltrate cloud credentials from any environment where it was installed. The package was available for roughly 10.5 to 13.5 hours: from 22:20 UTC on April 24 until its removal at some point between 08:51 and 11:51 UTC on April 25.

The attack exploited a script injection flaw in the package maintainer's GitHub Actions workflow. An attacker crafted a pull request comment that exploited unsafe handling of user input in the CI/CD pipeline, allowing them to inject malicious code into the package build process. Any data engineer who ran pip install elementary-data==0.23.3 during that window potentially exposed AWS credentials, GCP service account keys, Azure tokens, and database connection strings to the attacker.

Timeline

April 24, 22:20 UTC: Malicious version 0.23.3 published to PyPI
April 24-25 (roughly 10.5 to 13.5 hours): Package available for installation; credential theft active
April 25, 8:51-11:51 UTC: Package removed from PyPI

The window was short, but long enough to matter. Organizations with automated dependency updates, or developers installing packages during off-hours, may not have noticed the compromise until credentials had already been exfiltrated.

Which Controls Failed or Were Missing

Unsafe GitHub Actions Workflow Configuration

The root cause was a workflow that processed untrusted input, specifically pull request comments, without proper sanitization. When an expression like ${{ github.event.comment.body }} appears in a run: step, GitHub Actions substitutes its value into the script text before the shell starts, so whatever the commenter wrote is executed as part of the script. This is a well-documented anti-pattern, yet it remains common in open-source repositories.
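
The mechanism can be reproduced outside GitHub. The sketch below is a simplified simulation, not GitHub's implementation: render_inline is a hypothetical stand-in for expression substitution, the comment text is an invented payload, and a POSIX sh is assumed to be available. It contrasts splicing the comment into the script with handing it to the shell through an environment variable.

```python
import os
import subprocess


def render_inline(template, comment_body):
    """Mimic expression substitution: the value is pasted into the script text."""
    return template.replace("${{ github.event.comment.body }}", comment_body)


def run_script(script, extra_env=None):
    """Run a script with sh and return its standard output."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=False
    )
    return result.stdout


# Invented payload: closes the quoted string, then appends its own command.
comment = '"; echo INJECTED; echo "'

# Unsafe: the comment becomes part of the script before the shell parses it.
unsafe_script = render_inline('echo "${{ github.event.comment.body }}"', comment)
unsafe_out = run_script(unsafe_script)

# Safer: the script text is fixed; the comment arrives only as a variable value.
safe_out = run_script('echo "$COMMENT_BODY"', {"COMMENT_BODY": comment})

print(repr(unsafe_script))
print(repr(unsafe_out))
print(repr(safe_out))
```

In the first case the shell runs the injected echo as its own command. In the second the comment is printed back verbatim, because the shell only ever sees it as the value of a variable.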

No Package Signing or Verification

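The compromised package was published without cryptographic signatures or attestations that would let consumers verify its provenance. PyPI removed support for PGP signatures in 2023 and now supports digital attestations (PEP 740), but publishing them is optional and pip does not verify them by default. Provenance checks also have limits in a case like this one: when the maintainer's own build pipeline is compromised, an artifact can carry valid provenance and still be malicious.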

Missing Dependency Pinning and Lock Files

Organizations that installed the package likely used an open-ended version specifier (elementary-data>=0.23.0) rather than an exact pin (elementary-data==0.23.2). Any automated system that resolved dependencies during the window therefore pulled the malicious version as the newest match.
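
The difference between the two specifiers can be shown with a toy matcher. This sketch handles only == and >= on plain dotted-integer versions; real resolvers implement the full version specifier rules, and parse_version and admits are hypothetical helpers.

```python
def parse_version(text):
    """Turn '0.23.3' into (0, 23, 3). Handles plain dotted integers only."""
    return tuple(int(part) for part in text.strip().split("."))


def admits(requirement, candidate):
    """Return True if a '==' or '>=' requirement accepts the candidate version."""
    for operator in ("==", ">="):
        _name, found, wanted = requirement.partition(operator)
        if found:
            have, want = parse_version(candidate), parse_version(wanted)
            return have == want if operator == "==" else have >= want
    raise ValueError(f"unsupported requirement: {requirement!r}")


malicious = "0.23.3"
open_ended = admits("elementary-data>=0.23.0", malicious)
exact_pin = admits("elementary-data==0.23.2", malicious)

print(f">=0.23.0 accepts {malicious}: {open_ended}")
print(f"==0.23.2 accepts {malicious}: {exact_pin}")
```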

Inadequate Secret Rotation Procedures

The exposure window of roughly 10.5 to 13.5 hours was sufficient for credential theft, but the real damage depends on how long those credentials remained valid. Organizations without automated secret rotation or anomalous access detection faced extended risk.

Lack of Runtime Integrity Monitoring

The malicious code executed during package installation. Organizations without file integrity monitoring or runtime application self-protection (RASP) had no visibility into the exfiltration activity until after the fact.

What the Relevant Standards Require

PCI DSS v4.0.1 Requirement 6.3.2

Requirement 6.3.2 calls for an inventory of bespoke and custom software, and of the third-party software components incorporated into it, to support vulnerability and patch management. Open-source dependencies fall within that inventory. Organizations processing payment data need to know which third-party components they run, and at which versions, so that a malicious package in the dependency chain can be identified and removed.

Gap: Most organizations don't treat open-source dependencies as "third-party code" requiring security validation, despite the identical risk profile.

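NIST 800-53: SA-12 (Supply Chain Protection) in Rev 4, SR Family in Rev 5

SA-12 asks organizations to protect against supply chain threats by employing security safeguards as part of a comprehensive information security strategy. Note that SA-12 is a Rev 4 control: Rev 5 withdrew it and moved its content into the Supply Chain Risk Management (SR) family. The relevant enhancements, with their Rev 5 equivalents, include:

SA-12(5): Limitation of harm (SR-3(2) in Rev 5)
SA-12(9): Operations security for supply chain elements (SR-7 in Rev 5)
SA-12(11): Penetration testing and analysis of supply chain elements (SR-6(1) in Rev 5)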

Gap: Few organizations perform ongoing security analysis of their transitive dependencies, especially for data engineering tools that may not be considered "critical" infrastructure.

ISO 27001:2022 Control 8.30 (Outsourced Development)

Control 8.30 requires the organization to direct, monitor, and review the activities related to outsourced system development. While this control typically applies to contracted development, the principle extends to open-source dependencies: they are code you run but did not write.

Gap: Dependency installation is rarely governed by the same security requirements as custom code or contracted development.

Lessons and Action Items for Your Team

Immediate: Audit for Exposure

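If you use elementary-data or any package in the Python data engineering ecosystem, check your environments, lock files, and CI logs for installs on April 24-25. Look for version 0.23.3 specifically. The first command below shows the version installed in the current environment; the second searches a pip log, which exists only if pip was run with --log or configured to write one (the path is an example, not a default):

pip freeze | grep -i "^elementary-data=="
grep "elementary-data==0.23.3" /var/log/pip.log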

If you find it, assume credential compromise. Rotate all cloud credentials accessible from affected environments immediately—not just the credentials you think were exposed, but everything in that scope.
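
Installed versions can also be checked from Python itself, which is convenient when sweeping many virtual environments. The sketch below uses importlib.metadata from the standard library; KNOWN_BAD, installed_version, and audit are illustrative names. It reports only what is installed now, so an environment that has since been upgraded or rebuilt will come back clean even if 0.23.3 ran there.

```python
from importlib import metadata

# Versions named in this write-up; extend as advisories are published.
KNOWN_BAD = {"elementary-data": {"0.23.3"}}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def audit(known_bad=KNOWN_BAD, lookup=installed_version):
    """Return (name, version) pairs for known-bad versions found by lookup."""
    findings = []
    for name, bad_versions in known_bad.items():
        found = lookup(name)
        if found in bad_versions:
            findings.append((name, found))
    return findings


if __name__ == "__main__":
    hits = audit()
    if hits:
        for name, version in hits:
            print(f"COMPROMISED: {name}=={version} is installed; rotate credentials")
    else:
        print("No known-bad versions installed in this environment")
```

The lookup parameter exists so the check can be pointed at another data source, such as a parsed lock file, without changing the comparison logic.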

Short-term: Harden GitHub Actions Workflows

If you maintain packages or use GitHub Actions for CI/CD:

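Never interpolate untrusted input directly into run commands. Instead of writing ${{ github.event.comment.body }} inside the script, assign it to an environment variable in the step's env: block and reference that variable, quoted, in the script. GitHub does not sanitize the value; the protection comes from the shell receiving it as data rather than as part of the script text.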
Enable workflow approval requirements for first-time contributors. GitHub's settings allow you to require manual approval before workflows run on pull requests from new users.
Use OpenID Connect (OIDC) tokens instead of long-lived credentials in workflows. GitHub's OIDC integration with AWS, GCP, and Azure provides short-lived tokens scoped to specific workflows.
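
One way to catch the first item in review is a small scan of workflow files. The sketch below is a rough heuristic, not a YAML parser: the UNTRUSTED list is an illustrative, incomplete assumption, and find_risky_expressions is a hypothetical helper that tracks run: blocks by indentation alone. It flags untrusted expressions inside run: scripts and ignores the same expressions when they appear in env: mappings.

```python
import re

# Contexts an outside contributor can influence (illustrative, not exhaustive).
UNTRUSTED = (
    "github.event.comment.body",
    "github.event.issue.title",
    "github.event.issue.body",
    "github.event.pull_request.title",
    "github.event.pull_request.body",
    "github.head_ref",
)

EXPRESSION = re.compile(r"\$\{\{(.*?)\}\}")


def find_risky_expressions(workflow_text):
    """Return (line_number, expression) pairs for untrusted values in run: scripts."""
    findings = []
    run_column = None  # column of the active "run:" key, or None outside a script
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not end a multi-line script
        indent = len(line) - len(line.lstrip())
        # A step may begin with "- run:", so look past the list marker.
        key = stripped[2:].lstrip() if stripped.startswith("- ") else stripped
        if key.startswith("run:"):
            run_column = indent + len(stripped) - len(key)
        elif run_column is not None and indent <= run_column:
            run_column = None  # dedent: the script has ended
        if run_column is None:
            continue
        for match in EXPRESSION.finditer(line):
            expression = match.group(1).strip()
            if any(context in expression for context in UNTRUSTED):
                findings.append((number, expression))
    return findings


UNSAFE = """\
steps:
  - name: greet
    run: |
      echo "${{ github.event.comment.body }}"
"""

SAFE = """\
steps:
  - name: greet
    env:
      COMMENT_BODY: ${{ github.event.comment.body }}
    run: |
      echo "$COMMENT_BODY"
"""

unsafe_findings = find_risky_expressions(UNSAFE)
safe_findings = find_risky_expressions(SAFE)
print(unsafe_findings)
print(safe_findings)
```

A heuristic like this will miss cases a real parser would catch, so treat it as a tripwire for code review rather than a guarantee.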

Medium-term: Implement Dependency Integrity Controls

Pin all dependencies with hash verification. Use lock files (requirements.txt with hashes, poetry.lock, Pipfile.lock) and verify them in CI:

pip install --require-hashes -r requirements.txt
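
Conceptually, hash checking means an artifact is accepted only if its digest is already listed in the requirements file. The sketch below illustrates that comparison with the standard library; it is not pip's implementation, and sha256_of, matches_pin, and the file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hashes):
    """Check a file against 'sha256:<hex>' entries, the form used after --hash=."""
    return f"sha256:{sha256_of(path)}" in set(pinned_hashes)


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "example-1.0.0-py3-none-any.whl"  # hypothetical name
    artifact.write_bytes(b"bytes reviewed when the lock file was generated")
    pins = [f"sha256:{sha256_of(artifact)}"]  # what the lock file would record

    unchanged_ok = matches_pin(artifact, pins)

    # Simulate different content later served under the same file name.
    artifact.write_bytes(b"different bytes published later")
    tampered_ok = matches_pin(artifact, pins)

print(f"unchanged artifact accepted: {unchanged_ok}")
print(f"tampered artifact accepted:  {tampered_ok}")
```

Hash-checking mode in pip also requires exact == pins, so a newly published release is not selected at all, and altered content under an existing version number fails the digest comparison.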

Deploy a private package mirror or proxy (Artifactory, Nexus, or AWS CodeArtifact). Configure it to:
- Cache packages after manual security review
- Block new versions until approved
- Log all package installations for forensic capability
Run dependency scanning in CI with tools that check for known malicious packages (not just CVEs). Socket Security, Phylum, and Checkmarx Supply Chain Security specifically detect malicious behavior patterns.

Long-term: Build Defense in Depth

Implement least-privilege credential scoping. The credentials stolen in this attack were likely over-provisioned. Use IAM permissions boundaries and resource tags to limit blast radius.
Deploy runtime monitoring that detects unexpected network connections during package installation. Tools like Falco or osquery can alert on outbound connections from pip or Python setup scripts.
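Establish automated credential rotation with maximum lifetimes of 24-48 hours for programmatic access. AWS Secrets Manager, Azure Key Vault, and GCP Secret Manager each provide rotation scheduling or hooks, though how much of the rotation is automated differs by service and secret type.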
Create an incident response playbook specifically for supply chain compromises. Your existing breach procedures likely assume direct intrusion, not malicious code in trusted packages. The response timeline and evidence preservation steps are different.
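
The rotation item lends itself to a simple check. The sketch below assumes you can export a creation timestamp for each programmatic credential; the inventory, the names in it, and overdue_credentials are hypothetical, and the 48-hour limit is the upper end of the range suggested above.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=48)  # upper end of the 24-48 hour range


def overdue_credentials(created_at_by_name, now, max_age=MAX_AGE):
    """Return the sorted names of credentials older than max_age at time now."""
    return sorted(
        name
        for name, created_at in created_at_by_name.items()
        if now - created_at > max_age
    )


# Hypothetical inventory: credential name -> creation time (UTC).
inventory = {
    "ci-deployer": datetime(2025, 4, 25, 0, 0, tzinfo=timezone.utc),
    "warehouse-loader": datetime(2025, 4, 20, 9, 30, tzinfo=timezone.utc),
}

check_time = datetime(2025, 4, 25, 12, 0, tzinfo=timezone.utc)
overdue = overdue_credentials(inventory, check_time)
print(overdue)
```

Run on a schedule, a check like this turns the maximum-lifetime policy into an alert: any credential on the list would still have been valid to an attacker who stole it two days earlier.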

The elementary-data attack succeeded because it exploited the trust relationship between developers and their tools. You can't eliminate that trust, but you can verify it continuously. Treat every package installation as a potential security event, and build controls that assume compromise rather than hoping for package maintainer vigilance.
