Bucket Squatting in Vertex AI: Risks of Predictable Names

What Happened

Google's Vertex AI SDK for Python, versions 1.139.0 and 1.140.0, had a design flaw that let attackers execute arbitrary code. The flaw came from predictable Cloud Storage bucket naming: the SDK built the bucket name from only the customer's project ID and region, both of which are often easy to discover or guess.

Attackers could predict these bucket names, create the buckets before legitimate users, and upload malicious model files serialized with Python's pickle format. When a victim's application loaded what it thought was a legitimate AI model, it executed the attacker's code instead.

Unit 42 discovered this vulnerability using an LLM-augmented code analysis workflow. Google addressed the issue in SDK versions 1.144.0 and 1.148.0 by adding bucket ownership validation.

Timeline

Discovery phase: Unit 42 researchers used an LLM in their vulnerability discovery workflow to identify the predictable naming pattern in the Vertex AI SDK source code.

Exploitation window: Versions 1.139.0 and 1.140.0 were vulnerable. Organizations using these versions during this period faced potential compromise.

Remediation: Google released fixed versions 1.144.0 and 1.148.0, which validate bucket ownership before loading model artifacts.

Current state: Organizations must ensure they've upgraded to version 1.144.0 or later and audit any models loaded during the vulnerable period.

Which Controls Failed or Were Missing

Input validation: The SDK did not verify that a cloud storage bucket belonged to the expected project before loading serialized data. Always validate the source of executable content.

Namespace protection: Using predictable inputs (project ID + region) without additional entropy led to a namespace collision vulnerability. Attackers could claim these names in advance.

Deserialization security: Using pickle to deserialize untrusted data is risky. Pickle can execute arbitrary code during deserialization. The SDK lacked a mechanism to verify that model artifacts came from a trusted source before deserializing them.
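To make the deserialization risk concrete, here is a minimal and harmless sketch of how pickle runs a callable chosen by whoever produced the bytes. The class name Marker and the arithmetic expression are illustrative assumptions, not the payload used against Vertex AI; a real attacker would substitute a shell command or similar.

```python
import pickle


class Marker:
    """Harmless stand-in for a malicious model file (hypothetical name)."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # Here the callable is eval with a harmless expression.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Marker())

# Loading does not rebuild a Marker. It calls eval("6 * 7") and
# returns whatever that call produces.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The victim only calls pickle.loads, yet code selected by the author of the bytes runs inside the victim's process. That is why the origin of the bytes has to be established before loading them.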

Dependency management: Organizations still running 1.139.0 or 1.140.0 after the fix shipped likely lacked automated dependency update processes. Several releases separate the vulnerable versions from the first patched one (1.144.0), so a team that pinned a vulnerable version without monitoring SDK releases could stay exposed without noticing.

What the Standards Require

OWASP ASVS v4.0.3 Requirement 5.5.3 states: "Verify that deserialization of untrusted data is avoided or is protected in both custom code and third-party libraries." The SDK violated this by deserializing pickle objects from unverified cloud storage buckets.

NIST 800-53 Rev 5 control SI-10 (Information Input Validation) requires applications to check the validity of information inputs, including data sources. The SDK's failure to validate bucket ownership before loading model artifacts directly violated this control.

ISO 27001 Annex A.8.24 addresses the use of cryptography, but the more relevant control here is A.8.31, on the separation of development, test, and production environments. An organization that loads Vertex AI model artifacts into production without verifying their provenance is not maintaining that separation.

PCI DSS v4.0.1 Requirement 6.3.3 requires that system components are protected from known vulnerabilities by installing applicable security patches and updates. Once Google released patched SDK versions, any organization processing payment data on a vulnerable version needed to upgrade to remain compliant with this requirement.

Lessons and Action Items for Your Team

Audit your current Vertex AI SDK version immediately. Run pip show google-cloud-aiplatform and ensure you're on version 1.144.0 or later. If you're on 1.139.0-1.140.0, upgrade and investigate whether any models were loaded during the exposure window.
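The same check can be scripted for CI or fleet audits. The sketch below is one possible approach, not an official tool: the version boundaries are taken from this article, the function names are hypothetical, and the parser assumes plain X.Y.Z version strings.

```python
from importlib import metadata

# Assumptions taken from this article: 1.139.0 and 1.140.0 are affected,
# and 1.144.0 is the first release that contains the fix.
VULNERABLE = {(1, 139, 0), (1, 140, 0)}
FIRST_FIXED = (1, 144, 0)


def parse_version(text):
    """Parse a plain 'X.Y.Z' string; pre-release suffixes raise ValueError."""
    return tuple(int(part) for part in text.split(".")[:3])


def classify(version_text):
    """Return 'vulnerable', 'patched', or 'unlisted' for a version string."""
    version = parse_version(version_text)
    if version in VULNERABLE:
        return "vulnerable"
    if version >= FIRST_FIXED:
        return "patched"
    # Not named in the advisory as affected or fixed: review manually.
    return "unlisted"


def installed_status(package="google-cloud-aiplatform"):
    """Classify the installed SDK, or report that it is absent."""
    try:
        return classify(metadata.version(package))
    except metadata.PackageNotFoundError:
        return "not installed"


print(installed_status())
```

A result of "unlisted" means the installed version is neither named as affected nor at or above the first fixed release, so it is worth checking against Google's advisory directly rather than assuming it is safe.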

Implement dependency update monitoring. Set up automated alerts for security updates to your Python dependencies using tools like Dependabot, Snyk, or pip-audit.

Never deserialize untrusted data with pickle. Use safer serialization formats like ONNX or TensorFlow SavedModel, or load PyTorch checkpoints with torch.load and weights_only=True. If you must use pickle, implement signature verification: cryptographically sign your model files and validate signatures before deserialization.
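As a rough illustration of verifying before deserializing, the sketch below uses an HMAC-SHA256 tag from the Python standard library. This is a simplification of full signature verification: HMAC relies on a shared secret, so it suits a single team that both writes and reads the artifact, while asymmetric signatures fit better when producers and consumers differ. The function names and the demo key are hypothetical.

```python
import hashlib
import hmac
import pickle


def sign_artifact(data: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the serialized model bytes."""
    return hmac.new(key, data, hashlib.sha256).digest()


def load_verified(data: bytes, tag: bytes, key: bytes):
    """Unpickle only if the tag matches; otherwise refuse to load."""
    expected = sign_artifact(data, key)
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("artifact tag mismatch; refusing to unpickle")
    return pickle.loads(data)


# Demo key for illustration only; a real key belongs in a secret manager.
key = b"demo-key-not-for-production"
blob = pickle.dumps({"weights": [0.1, 0.2, 0.3]})
tag = sign_artifact(blob, key)

model = load_verified(blob, tag, key)  # tag matches, so the load proceeds
```

The important property is the ordering: the check runs on the raw bytes, so a tampered or substituted file is rejected before pickle ever sees it.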

Add resource ownership validation to your cloud workflows. Verify that resources (buckets, queues, databases) belong to your project or account. Google's fix validates bucket ownership; apply this pattern across your cloud resource access.
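One way to apply an ownership check in your own code is sketched below. It is not Google's actual fix, whose implementation this article does not describe. It assumes the google-cloud-storage client library, where a fetched bucket exposes a project_number property, and it compares that against the project number you expect (a project number, not a project ID). The function names are hypothetical.

```python
def check_bucket_owner(actual_project_number, expected_project_number):
    """Raise PermissionError unless the bucket belongs to the expected project."""
    if actual_project_number is None:
        raise PermissionError("bucket owner could not be determined")
    if actual_project_number != expected_project_number:
        raise PermissionError(
            f"bucket belongs to project {actual_project_number}, "
            f"expected {expected_project_number}"
        )


def fetch_bucket_project_number(bucket_name):
    """Look up the owning project number (needs google-cloud-storage and credentials)."""
    # Imported lazily so check_bucket_owner stays usable without the library.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.get_bucket(bucket_name)  # fetches bucket metadata
    return bucket.project_number


# Intended use before reading any artifact (names are placeholders):
#   owner = fetch_bucket_project_number("my-staging-bucket")
#   check_bucket_owner(owner, expected_project_number=123456789012)
```

Treating an unknown owner as a failure is deliberate: if the lookup returns nothing, the safe default is to refuse the load.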

Review your naming conventions for collision resistance. Use patterns like {project-id}-{region}-{random-uuid} to make resource names harder to predict. This is important for any resource where namespace squatting could be exploited.
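A naming helper along these lines might look like the sketch below. It swaps the full UUID for a 16-character random hex suffix from the secrets module, an assumption made to stay inside Cloud Storage's 63-character limit for bucket names without dots. The function name and sample values are hypothetical.

```python
import secrets

MAX_BUCKET_NAME_LENGTH = 63  # Cloud Storage limit for names without dots


def unpredictable_bucket_name(project_id, region):
    """Build '{project-id}-{region}-{random-suffix}' with 64 bits of randomness."""
    suffix = secrets.token_hex(8)  # 16 hex characters
    name = f"{project_id}-{region}-{suffix}".lower()
    if len(name) > MAX_BUCKET_NAME_LENGTH:
        raise ValueError(f"bucket name is too long ({len(name)} characters)")
    return name


print(unpredictable_bucket_name("my-project", "us-central1"))
```

Because the suffix cannot be re-derived, the generated name has to be stored in configuration after the bucket is created. Random names also complement ownership validation rather than replace it, since a name can leak through logs or URLs.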

Test your LLM-assisted security tools. If you're using AI for code review and vulnerability discovery, validate their findings. LLMs can miss context or generate false positives.

Document your model provenance. Implement a model registry that tracks model lineage, training data sources, and deployment history. When a vulnerability surfaces, you need to quickly determine which models might be compromised.

The bucket squatting vulnerability in Vertex AI combined predictable naming, missing ownership validation, and unsafe deserialization. Your infrastructure may have similar patterns waiting to be exploited. Identify and address them proactively.
