28.65 Million Secrets Leaked to Public GitHub in 2025

What Happened

GitGuardian's State of Secrets Sprawl 2026 report counted 28.65 million new hardcoded secrets committed to public GitHub repositories during 2025. The leaks included API keys from legacy systems as well as new credential types introduced by AI development workflows, from model training environments to agent orchestration platforms. Among them were database passwords, cloud access keys, and tokens for AI services that many organizations had not yet brought into their secrets management programs.

This incident is not a single breach but a continuous exposure event affecting thousands of organizations, with credentials remaining valid long after their initial publication.

Timeline

Throughout 2025: Developers committed secrets to public repositories at an increasing rate as AI tools expanded the number of services requiring authentication. Internal collaboration platforms and repositories became secondary exposure points as teams copied code snippets and configuration files between environments.

Post-exposure: Many leaked credentials remained active for weeks or months. Organizations lacked automated detection for the new credential types introduced by AI services, and manual review processes couldn't keep up with the volume of potential exposures.

Ongoing: As of publication, a significant percentage of these 28.65 million secrets remain valid and exploitable.

Which Controls Failed or Were Missing

Pre-Commit Scanning

Most organizations scan public repositories for secrets, but gaps emerged around:

  • New AI service credential formats that existing regex patterns didn't catch
  • MCP (Model Context Protocol) tokens and similar emerging standards
  • Ephemeral credentials that developers assumed were short-lived but weren't

Your pre-commit hooks likely check for AWS keys and GitHub tokens, but they probably don't recognize credentials for the vector database you deployed last quarter or the LLM gateway your team set up for experimentation.
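A minimal sketch of how a pre-commit check could keep its detection rules in one table, so that adding a new service becomes a one-line change. The first two patterns follow commonly published token shapes for AWS access key IDs and classic GitHub personal access tokens. The third rule, including the `llmgw_` prefix, is a hypothetical format for an internal LLM gateway and is there only to show where your own services would go.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    rule: str
    line_no: int
    match: str  # redacted, so the scanner's own output does not leak the secret


# Rule name -> compiled pattern. Extend this table for each service you deploy.
RULES = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github-pat-classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    # ASSUMPTION: hypothetical token format for an internal LLM gateway.
    "internal-llm-gateway": re.compile(r"\bllmgw_[a-f0-9]{32}\b"),
}


def redact(secret: str) -> str:
    """Keep the first four characters and mask the rest."""
    return secret[:4] + "*" * (len(secret) - 4)


def scan_text(text: str, rules=RULES) -> list[Finding]:
    """Return one finding per rule match, with 1-based line numbers."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in rules.items():
            for m in pattern.finditer(line):
                findings.append(Finding(name, line_no, redact(m.group(0))))
    return findings
```

A hook built on this would run `scan_text` over the staged diff and block the commit when the result is non-empty. Established scanners ship far larger rule sets; the point here is only that the rule table needs an owner who updates it when a new service is adopted.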

Internal Repository Monitoring

The shift to internal exposure represents a control blind spot. Teams assume private repositories are safe, leading them to:

  • Disable or relax secret scanning on internal repos
  • Allow broader credential sharing in Slack, Confluence, and internal wikis
  • Skip rotation requirements for "internal-only" credentials

This assumption fails when:

  • An internal repository becomes public (intentionally or through misconfiguration)
  • A contractor or departing employee retains access
  • An attacker gains initial access to your collaboration tools

Credential Lifecycle Management

The persistence of valid leaked credentials indicates failures in:

  • Rotation frequency: Credentials created months or years ago remain active
  • Automated revocation: No triggers exist to invalidate credentials when exposure is detected
  • Inventory completeness: Security teams can't rotate what they don't know exists

When a developer creates a service account for an AI training job, does that credential get added to your secrets inventory? Can you revoke it without asking the developer which service it's for?
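One way to make those two questions answerable is to index the inventory by a fingerprint of each secret, so that a leaked value can be mapped to its service and owner without storing the secret itself. The sketch below is illustrative only: the record fields, the `revoke_ref` pointer, and the class names are assumptions, not part of any particular secrets manager.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CredentialRecord:
    service: str
    owner_team: str
    environment: str
    revoke_ref: str  # ASSUMPTION: pointer to a revocation runbook or API handle


def fingerprint(secret: str) -> str:
    """SHA-256 hex digest of the secret; the plaintext is never stored."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


class SecretsInventory:
    def __init__(self) -> None:
        self._by_fingerprint: dict[str, CredentialRecord] = {}

    def register(self, secret: str, record: CredentialRecord) -> None:
        """Called at provisioning time, when the secret is first issued."""
        self._by_fingerprint[fingerprint(secret)] = record

    def lookup(self, leaked_secret: str) -> Optional[CredentialRecord]:
        """Map a value found by a scanner back to its service and owner."""
        return self._by_fingerprint.get(fingerprint(leaked_secret))
```

A `lookup` that returns `None` is itself a finding: the credential exists but was never inventoried. For low-entropy secrets such as human-chosen passwords, a plain hash can be brute-forced, so a keyed HMAC would likely be a safer fingerprint.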

What the Relevant Standards Require

PCI DSS v4.0.1

Requirement 8.3.2 requires strong cryptography to render all authentication factors unreadable during transmission and storage on all system components.

Requirement 8.6.3 requires that passwords/passphrases for application and system accounts are protected against misuse: they must be changed periodically and upon suspicion or confirmation of compromise, and be constructed with sufficient complexity.

If your environment processes payment data, every hardcoded secret in a repository represents a potential violation. The standard doesn't provide exceptions for "internal" repositories or "temporary" credentials.

ISO/IEC 27001:2022

Control 5.15 (Access control) requires that access to information and other associated assets is restricted in accordance with established policy.

Control 5.17 (Authentication information) mandates that allocation and management of authentication information is controlled through a management process, including advising personnel to protect authentication information appropriately.

Hardcoded credentials bypass your access control policy entirely. They create persistent, unauditable access paths that your identity management system doesn't know about.

NIST SP 800-53 Rev. 5

IA-5 (Authenticator Management) requires organizations to manage system authenticators by establishing minimum requirements, changing default authenticators, and protecting authenticator content.

SC-12 (Cryptographic Key Establishment and Management) requires organizations to establish and manage cryptographic keys in accordance with organization-defined requirements for key generation, distribution, storage, access, and destruction.

Manual processes cannot scale to 28.65 million potential exposures. Meeting these controls at that volume requires automation in practice.

Lessons and Action Items for Your Team

Expand Your Scanning Coverage

  1. Audit your regex patterns against the services your teams actually use. Add patterns for:

    • Vector database credentials (Pinecone, Weaviate, Qdrant)
    • LLM API keys (OpenAI, Anthropic, Cohere, local model endpoints)
    • AI infrastructure platforms (Weights & Biases, MLflow, experiment tracking)
  2. Enable scanning on internal repositories. Apply the same detection rules you use for public repos. The exposure risk differs only in timing, not impact.

  3. Scan beyond git. Extend detection to:

    • Slack message history
    • Confluence pages and comments
    • Jira ticket descriptions
    • Internal wikis and documentation platforms
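Free-form text from chat and wiki pages often contains credentials in formats no regex rule anticipates. A common complement is an entropy heuristic: flag long tokens whose characters look random. The sketch below assumes a 20-character minimum and a threshold of 4.0 bits per character. Both values are illustrative and would need tuning against your own data.

```python
import math
import re
from collections import Counter

# Candidate tokens: runs of 20+ characters from a typical key alphabet.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def suspicious_tokens(text: str, threshold: float = 4.0) -> list[str]:
    """Return long tokens whose entropy meets the (assumed) threshold."""
    return [t for t in TOKEN_RE.findall(text) if shannon_entropy(t) >= threshold]
```

Entropy checks produce false positives on hashes, long identifiers, and encoded attachments, so they work best as a second pass that routes findings to review rather than triggering automatic revocation.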

Implement Automated Credential Lifecycle Controls

  1. Set maximum credential age policies. For service accounts, by environment:

    • 90 days for production service accounts
    • 30 days for development/staging
    • 7 days for experimental or proof-of-concept work
  2. Build automated rotation. Your secrets manager should handle rotation without developer intervention for:

    • Database credentials
    • API keys for third-party services
    • Service-to-service authentication tokens
  3. Create exposure response playbooks. When your scanning tool flags a leaked secret:

    • Automated revocation within 15 minutes (not manual ticket creation)
    • Temporary replacement credential issued to the owning team
    • Rotation of the permanent credential within 24 hours
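The age limits and response windows above can be written down as data, which makes them checkable by a scheduled job instead of a calendar reminder. This is a sketch under stated assumptions: the environment names are taken from the bullets, the function names are illustrative, and treating an unknown environment as the strictest tier is a design choice of the example, not something the policy specifies.

```python
from datetime import datetime, timedelta, timezone

# Maximum credential age per environment, from the policy above.
MAX_AGE = {
    "production": timedelta(days=90),
    "development": timedelta(days=30),
    "staging": timedelta(days=30),
    "experimental": timedelta(days=7),
}

# Exposure response windows, from the playbook above.
REVOKE_WITHIN = timedelta(minutes=15)
ROTATE_WITHIN = timedelta(hours=24)


def is_overdue(created_at: datetime, environment: str, now: datetime) -> bool:
    """True when the credential is older than its environment allows."""
    # ASSUMPTION: unknown environments get the strictest (shortest) limit.
    limit = MAX_AGE.get(environment, min(MAX_AGE.values()))
    return now - created_at > limit


def response_deadlines(detected_at: datetime) -> dict[str, datetime]:
    """Deadlines for revocation and permanent rotation after a detection."""
    return {
        "revoke_by": detected_at + REVOKE_WITHIN,
        "rotate_by": detected_at + ROTATE_WITHIN,
    }
```

Running `is_overdue` across the secrets inventory each day gives a list of credentials to rotate, and `response_deadlines` gives the exposure playbook timestamps that can be alerted on if they pass without action.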

Address the AI Tooling Gap

  1. Inventory your AI services. List every platform that requires credentials:

    • Model training environments
    • Inference endpoints
    • Agent frameworks and orchestration tools
    • Data pipeline components
  2. Standardize credential provisioning. Developers should request AI service access through your existing IAM workflow, not create their own API keys. If they can't, your IAM workflow has a gap.

  3. Monitor credential creation. Alert when new service accounts appear outside your provisioning process. This catches shadow IT and unauthorized experimentation before credentials leak.
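The monitoring step reduces to a comparison between two lists: the service accounts each platform reports and the ones your provisioning process issued. A minimal sketch, assuming both lists can be exported as simple platform and name pairs (the `ServiceAccount` type and function name are illustrative):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceAccount:
    platform: str  # e.g. an inference endpoint or agent framework
    name: str


def _key(account: ServiceAccount) -> tuple[str, str]:
    # Compare case-insensitively so "SVC-Search" and "svc-search" match.
    return (account.platform.lower(), account.name.lower())


def find_unprovisioned(observed, provisioned):
    """Return observed accounts that the provisioning process never issued."""
    approved = {_key(a) for a in provisioned}
    return sorted((a for a in observed if _key(a) not in approved), key=_key)
```

Each account returned is a candidate alert. How the observed list is collected depends on the audit or admin API each platform offers, which this sketch does not cover.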

Fix the Internal Repository Problem

The distinction between "public" and "internal" repositories is a policy boundary, not a security control. Treat all repositories as potentially public:

  • Same scanning rules
  • Same rotation requirements
  • Same incident response procedures

Your compliance auditor won't accept "but it was internal" as justification for hardcoded production database passwords.

The 28.65 million secrets leaked in 2025 represent a failure of assumptions: that developers would catch secrets before commit, that internal repositories were safe, that new AI services would fit existing detection patterns. Your controls need to work without those assumptions.
