One Unrotated Token: Grafana's GitHub Breach

What Happened

On May 1, 2025, Grafana detected unauthorized access to its private GitHub repositories. The attacker exploited a single GitHub workflow token that should have been rotated weeks earlier. This token remained valid after a broader supply-chain compromise involving TanStack npm packages, granting the attacker access to Grafana's source code and operational information. Although no customer data was exposed, the incident highlighted a gap in token lifecycle management that many organizations face.

The breach originated from a supply-chain attack against TanStack's npm packages. When Grafana responded to that initial compromise, the team rotated most tokens but missed one GitHub workflow token. That token stayed valid, and exploitable, until it was revoked on May 1.

Timeline

  • Initial compromise: TanStack npm packages compromised (date not publicly disclosed)
  • Grafana's response: Token rotation initiated across the organization
  • Gap identified: One GitHub workflow token not included in rotation scope
  • May 1: Grafana detects malicious activity using the unrotated token
  • May 1 (same day): Incident response plan activated; token revoked
  • Post-incident: Source code and operational data confirmed accessed; customer data confirmed safe

Which Controls Failed or Were Missing

Token Inventory and Classification

Grafana did not have a complete inventory of all tokens in use across its CI/CD environment. When the team initiated emergency rotation following the TanStack compromise, they worked from an incomplete list. The GitHub workflow token was not documented in their token registry.

Automated Token Discovery

Manual token tracking was insufficient. In a complex CI/CD environment with multiple repositories, workflows, and integrations, human-maintained records can quickly become outdated. Grafana needed automated discovery to map every token with repository access.

Token Lifecycle Management

The unrotated token had no expiration policy. GitHub allows setting token expiration, but enforcement requires process discipline. Without automated expiration, a token can remain valid indefinitely, so a leaked credential stays usable until someone notices it and revokes it.
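A lifetime policy is easier to enforce when a script checks it. The sketch below is a minimal illustration, assuming you can export each token's creation and expiry dates into simple records. `TokenRecord`, `policy_violations`, and the 90-day ceiling are hypothetical names and values chosen for the example, not part of any GitHub API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed policy ceiling for illustration; pick a value that fits your risk model.
MAX_LIFETIME = timedelta(days=90)


@dataclass
class TokenRecord:
    name: str
    created_at: datetime
    expires_at: Optional[datetime]  # None means the token never expires


def policy_violations(tokens, now, max_lifetime=MAX_LIFETIME):
    """Return (name, reason) pairs for tokens that break the lifetime policy."""
    findings = []
    for t in tokens:
        if t.expires_at is None:
            findings.append((t.name, "no expiration set"))
        elif t.expires_at - t.created_at > max_lifetime:
            findings.append((t.name, "lifetime exceeds policy"))
        elif t.expires_at <= now:
            findings.append((t.name, "expired but still registered"))
    return findings


if __name__ == "__main__":
    now = datetime(2025, 5, 1, tzinfo=timezone.utc)
    sample = [TokenRecord("ci-deploy", now - timedelta(days=200), None)]
    for name, reason in policy_violations(sample, now):
        print(f"{name}: {reason}")
```

Running a check like this on a schedule turns "tokens should expire" from a convention into something that fails loudly when it is ignored.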

Monitoring and Anomaly Detection

Grafana detected the malicious activity, but how long the attacker had been using the token before detection is unknown. Token usage monitoring should flag:

  • Access from unexpected IP ranges
  • Unusual access patterns (time of day, frequency)
  • Repository access that doesn't match the token's normal scope

What the Relevant Standards Require

ISO/IEC 27001:2022

Control 5.17 (Authentication information) mandates managing the allocation and lifecycle of authentication information, including tokens. This involves:

  • Maintaining an inventory of all authentication credentials
  • Defining and enforcing lifecycle policies (rotation, expiration)
  • Revoking credentials when no longer needed

Grafana's gap: The token inventory was incomplete, so lifecycle management couldn't apply to all tokens.

Control 8.9 (Configuration management) requires that system configurations be documented, monitored, and reviewed, and CI/CD pipeline components fall within that scope. Treat your token registry as configuration data: versioned, reviewed, and audited.

NIST SP 800-53 Rev. 5

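IA-5 (Authenticator Management) requires managing authenticators, including tokens, throughout their lifecycle, and changing or refreshing them on an organization-defined schedule or when defined events occur. Relevant control enhancements include:

  • IA-5(1): Password-Based Authentication (written for passwords; it applies to token secrets only by analogy)
  • IA-5(7): No Embedded Unencrypted Static Authenticators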

The unrotated GitHub token violated the spirit of IA-5 by remaining valid indefinitely without review.

AU-6 (Audit Record Review, Analysis, and Reporting) requires reviewing and analyzing audit records for signs of inappropriate or unusual activity. GitHub provides audit logs for token usage. If you're not ingesting these into your SIEM and alerting on anomalies, token misuse can go unnoticed.

SOC 2 Type II (CC6.1)

CC6.1 of the Trust Services Criteria covers logical access controls over protected information assets. Read together with the related criteria CC6.2 and CC6.3, it expects:

  • Restricting access to authorized users
  • Removing access when no longer needed
  • Reviewing access rights periodically

An unrotated token after a known supply-chain compromise represents a failure to remove access when the trust boundary changed. The TanStack incident should have triggered a comprehensive access review, not just a partial token rotation.

Lessons and Action Items for Your Team

Build a Complete Token Inventory

Start with automated discovery. For GitHub specifically:

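  • Use the GitHub API to enumerate the personal access tokens, OAuth tokens, and GitHub App installations that have access to your organization; the API does not expose every token type to organization owners, so treat the result as a starting point rather than a complete list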
  • Scan your repositories for GitHub Actions workflows that use GITHUB_TOKEN or custom secrets
  • Document which services and workflows use each token and why

Don't rely on tribal knowledge. If a developer leaves and takes their mental map of token usage with them, you're vulnerable.
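One piece of discovery that is easy to automate is finding every secret your workflows reference and comparing that set against the documented registry. The sketch below is a minimal illustration, assuming the workflow files have already been read into memory and the registry is a plain set of secret names. `secrets_in_workflow` and `undocumented_secrets` are hypothetical helpers. The pattern only catches the `secrets.NAME` expression form, so reusable workflows that pass `secrets: inherit` and bracket-style lookups still need manual review.

```python
import re

# Matches references such as ${{ secrets.NPM_TOKEN }} anywhere in the file.
SECRET_REF = re.compile(r"\bsecrets\.([A-Za-z_][A-Za-z0-9_]*)")

# The automatic per-job token is issued by GitHub itself, so this sketch
# assumes it does not need a registry entry.
AUTOMATIC = {"GITHUB_TOKEN"}


def secrets_in_workflow(yaml_text):
    """Return the secret names referenced in one workflow file, upper-cased."""
    return {name.upper() for name in SECRET_REF.findall(yaml_text)}


def undocumented_secrets(workflows, registry):
    """Map each workflow path to the secrets it uses that the registry lacks.

    workflows: dict of path -> file contents
    registry:  set of documented secret names
    """
    documented = {name.upper() for name in registry} | AUTOMATIC
    gaps = {}
    for path, text in workflows.items():
        missing = secrets_in_workflow(text) - documented
        if missing:
            gaps[path] = missing
    return gaps


if __name__ == "__main__":
    workflows = {
        ".github/workflows/release.yml": (
            "env:\n"
            "  NPM_TOKEN: ${{ secrets.NPM_TOKEN }}\n"
            "  BOT_PAT: ${{ secrets.RELEASE_BOT_PAT }}\n"
        )
    }
    print(undocumented_secrets(workflows, {"NPM_TOKEN"}))
```

Over-reporting is the safer failure mode here: a match inside a comment costs a minute of review, while a missed reference is exactly the kind of gap that emergency rotation walks past.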

Implement Automated Token Rotation

For GitHub workflow tokens:

  • Use GitHub Apps instead of personal access tokens when possible (installation tokens are scoped to specific repositories and permissions, and they expire after one hour)
  • Set maximum lifetimes on all tokens (30 to 90 days, depending on risk)
  • Build rotation into your CI/CD pipeline so that token refresh is a deployment task, not a manual chore

For other CI/CD tokens (Jenkins, GitLab, CircleCI):

  • Use secret management tools (HashiCorp Vault, AWS Secrets Manager) that support automatic rotation
  • Implement just-in-time token provisioning where possible—generate short-lived tokens for specific jobs rather than maintaining long-lived credentials
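A risk-tiered rotation schedule can be expressed in a few lines. The sketch below is a minimal illustration, assuming each token is tagged with a risk tier and a last-rotated date. The tier names and the 30/60/90-day mapping are assumptions that simply spread the 30 to 90 day range across three tiers, and `rotation_due` and `overdue` are hypothetical helpers.

```python
from datetime import date, timedelta

# Assumed mapping from risk tier to maximum days between rotations.
ROTATION_DAYS = {"high": 30, "medium": 60, "low": 90}


def rotation_due(last_rotated, risk):
    """Return the date by which a token of this risk tier must be rotated."""
    return last_rotated + timedelta(days=ROTATION_DAYS[risk])


def overdue(tokens, today):
    """Return names of tokens past their due date, most overdue first.

    tokens: iterable of (name, last_rotated, risk) tuples
    """
    late = []
    for name, last_rotated, risk in tokens:
        due = rotation_due(last_rotated, risk)
        if due <= today:
            late.append((today - due, name))
    return [name for _, name in sorted(late, reverse=True)]


if __name__ == "__main__":
    tokens = [
        ("deploy", date(2025, 3, 1), "high"),
        ("docs", date(2025, 3, 1), "low"),
    ]
    print(overdue(tokens, date(2025, 5, 1)))
```

A pipeline job that fails when `overdue` returns anything is one way to make rotation a deployment concern instead of a calendar reminder.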

Monitor Token Usage

Ingest GitHub audit logs into your SIEM and create alerts for:

  • Token usage from unexpected geographic locations
  • Access to repositories outside the token's normal scope
  • High-volume API calls that could indicate data exfiltration
  • Token usage during off-hours (if your team doesn't work 24/7)

Set a baseline for normal token behavior and alert on deviations. This won't prevent a breach, but it will shrink the detection window.
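The baseline-and-deviation idea can be sketched as follows. This is a minimal illustration that assumes audit events have already been normalized into dictionaries with `token`, `repo`, `country`, and `hour_utc` keys. Those field names are hypothetical and do not mirror GitHub's audit log schema, and the working-hours window is an arbitrary example.

```python
from collections import defaultdict


def build_baseline(events):
    """Record the repositories and countries each token has been seen with."""
    baseline = defaultdict(lambda: {"repos": set(), "countries": set()})
    for e in events:
        baseline[e["token"]]["repos"].add(e["repo"])
        baseline[e["token"]]["countries"].add(e["country"])
    return baseline


def deviations(event, baseline, work_hours=range(7, 20)):
    """Return the reasons this event departs from the token's baseline."""
    known = baseline.get(event["token"])
    if known is None:
        return ["token has no baseline"]
    reasons = []
    if event["repo"] not in known["repos"]:
        reasons.append("repository outside normal scope")
    if event["country"] not in known["countries"]:
        reasons.append("unexpected location")
    if event["hour_utc"] not in work_hours:
        reasons.append("off-hours use")
    return reasons


if __name__ == "__main__":
    history = [{"token": "ci", "repo": "org/app", "country": "SE", "hour_utc": 9}]
    baseline = build_baseline(history)
    suspicious = {"token": "ci", "repo": "org/infra", "country": "BR", "hour_utc": 3}
    print(deviations(suspicious, baseline))
```

In practice this logic would live in SIEM detection rules rather than a standalone script, but the shape is the same: learn what each token normally touches, then alert on anything else.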

Define Supply-Chain Incident Response Procedures

When you discover a compromised dependency:

  1. Assume all credentials accessible to that dependency are compromised
  2. Rotate all tokens that could have been exposed—not just the obvious ones
  3. Review access logs for the window between compromise and detection
  4. Treat it as a full incident, not just a "dependency update"

Grafana activated its incident response plan the same day it detected the malicious activity, which limited the damage. But the earlier response to the TanStack compromise should have started with a comprehensive token audit.
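Steps 1 and 2 amount to computing a blast radius: every credential available to any job that ran the compromised dependency. The sketch below is a minimal illustration, assuming you maintain two mappings, one from CI job to its dependencies and one from CI job to the secrets it can read. `tokens_to_rotate` and the job, package, and secret names are hypothetical.

```python
def tokens_to_rotate(compromised_pkg, job_dependencies, job_secrets):
    """Return every secret readable by a job that depends on the package.

    job_dependencies: dict of job name -> set of package names
    job_secrets:      dict of job name -> set of secret names
    """
    exposed = set()
    for job, deps in job_dependencies.items():
        if compromised_pkg in deps:
            # Assume the whole job environment is compromised,
            # not just the secret the package was meant to use.
            exposed |= job_secrets.get(job, set())
    return exposed


if __name__ == "__main__":
    job_dependencies = {
        "build-frontend": {"example-ui-lib", "example-bundler"},
        "publish-docs": {"example-docs-gen"},
    }
    job_secrets = {
        "build-frontend": {"NPM_TOKEN", "WORKFLOW_PAT"},
        "publish-docs": {"DOCS_DEPLOY_KEY"},
    }
    print(sorted(tokens_to_rotate("example-ui-lib", job_dependencies, job_secrets)))
```

The design choice that matters is the union over the whole job: a rotation list built from "secrets the package legitimately used" is how the non-obvious token gets missed.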

Test Your Token Rotation Process

Run a tabletop exercise: "We just discovered a supply-chain compromise. Walk me through every token we need to rotate in the next four hours."

If your team can't produce a complete list, your token inventory is incomplete. If rotation takes longer than a few hours, your process needs automation.
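The exercise can end with a mechanical check: which inventoried tokens have not been rotated since the incident began? The sketch below is a minimal illustration, assuming the inventory is a set of token names and rotation times are recorded per token. `rotation_gaps` and the token names are hypothetical.

```python
from datetime import datetime, timezone


def rotation_gaps(inventory, rotated_at, incident_start):
    """Return inventoried tokens with no rotation on or after incident_start.

    inventory:  set of token names
    rotated_at: dict of token name -> datetime of last rotation
    """
    gaps = []
    for name in inventory:
        last = rotated_at.get(name)
        if last is None or last < incident_start:
            gaps.append(name)
    return sorted(gaps)


if __name__ == "__main__":
    incident_start = datetime(2025, 4, 1, tzinfo=timezone.utc)
    inventory = {"NPM_TOKEN", "WORKFLOW_PAT", "DOCS_DEPLOY_KEY"}
    rotated_at = {
        "NPM_TOKEN": datetime(2025, 4, 2, tzinfo=timezone.utc),
        "DOCS_DEPLOY_KEY": datetime(2025, 1, 10, tzinfo=timezone.utc),
    }
    print(rotation_gaps(inventory, rotated_at, incident_start))
```

This check is only as good as the inventory it runs against: a token that never made it into `inventory` will not appear as a gap, which is why discovery has to come first.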

The Grafana breach happened because one token slipped through the cracks during emergency rotation. Don't wait for an emergency to discover your cracks.
