← ClaudeAtlas

ci-failure-triagelisted

Triage a CI / PR check failure by READING the failure body before forming any hypothesis. Triggered whenever a required check is red, a PR is BLOCKED, a merge won't land, or you're about to call a failure "transient", "flaky", "stale", or "orphaned". Prevents dismissing a real failure (e.g. real CodeQL security alerts) as noise.
IgorGanapolsky/ThumbGate · ★ 23 · AI & Automation · score 74
Install: claude install-skill IgorGanapolsky/ThumbGate
# CI Failure Triage — Read Before You Conclude ## The failure this prevents Calling a red check "transient / flaky / orphaned / stale" WITHOUT reading its body. This session: a CodeQL check failed; it was dismissed as a "transient 4s orphaned check-run" — twice — when it was reporting **3 real security vulnerabilities** (2 critical command-injection + 1 high XSS). The conclusion came before the evidence. (2026 failure-triage practice = taxonomy → read → cluster → gate, a *repeatable detection system*, not vibes: https://latitude.so/blog/ai-agent-failure-modes-detection-playbook) ## Hard rule **You may not use the words "transient", "flaky", "stale", "orphaned", or "unrelated" about a check until you have read its failure body and quoted the actual error.** A duration (e.g. "4s") is a hint, never proof. ## Protocol (in order — do not skip) 1. **Identify the exact failing check + its commit.** ```bash head=$(gh pr view <N> --json headRefOid -q .headRefOid) gh pr checks <N> | grep -viP "\t(pass|skipping)\t" ``` 2. **Read the failure body. This is the step that gets skipped.** - GitHub Actions job: `gh run view --job <id> --log-failed | tail -40` - CodeQL / code-scanning: read the ALERTS, not just the check: ```bash gh api "repos/<owner>/<repo>/code-scanning/alerts?state=open&per_page=100" \ --jq '.[] | "\(.rule.id) | \(.rule.security_severity_level) | \(.most_recent_instance.location.path):\(.most_recent_instance.location.start_line) | r