
credit-stall-mid-orchestration-revive-collision

Recover gracefully when an Anthropic credit/billing failure stalls multiple in-flight parallel subagents mid-orchestration, then later resolves. Use when: (1) you've dispatched 2+ parallel subagents (Task tool with `run_in_background: true`) and a credit/billing issue, MCP outage, or other transient harness failure has frozen them, (2) the user reports the issue is now resolved and asks you to continue, (3) you're about to relaunch "resume" agents to pick up where stalled originals left off. Defends against the silent-revive trap: when API access is restored, ORIGINAL stalled subagents can auto-resume from where they froze and continue working IN PARALLEL with any "resume" agents you dispatch — leading to two agents racing on the same branch / PR / files, duplicate review comments, or one agent overwriting the other's work. Captures the diagnostic recipe (JSONL mtime + worktree-state inspection, NOT just agent status) and the safe resume pattern (state-aware briefs that detect current state and pivot to value
wan-huiyan/agent-traffic-control · ★ 2 · AI & Automation · score 79
Install: claude install-skill wan-huiyan/agent-traffic-control
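
The diagnostic recipe named in the description above (JSONL mtime plus worktree-state inspection, rather than agent status alone) might look roughly like the following Python sketch. It is illustrative only: the helper names `recently_active` and `worktree_state`, the 300-second activity window, and the idea that the caller already knows each subagent's transcript path and worktree directory are all assumptions, not part of the skill. Where the harness writes its JSONL transcripts is not stated here, so the paths are supplied by the caller.

```python
import subprocess
import time
from pathlib import Path


def recently_active(transcripts, window_seconds=300, now=None):
    """Return the transcript paths modified within the last `window_seconds`.

    A transcript that is still being written is a hint that the original
    subagent has revived, whatever its reported status says.
    The 300-second default is an arbitrary assumption.
    """
    now = time.time() if now is None else now
    active = []
    for path in transcripts:
        p = Path(path)
        # Missing files are skipped rather than treated as errors.
        if p.exists() and now - p.stat().st_mtime <= window_seconds:
            active.append(path)
    return active


def worktree_state(worktree):
    """Summarize a git worktree: current branch, HEAD commit, uncommitted files.

    Assumes `git` is on PATH and `worktree` is a checkout with at least
    one commit; otherwise subprocess raises CalledProcessError.
    """
    def git(*args):
        result = subprocess.run(
            ["git", "-C", str(worktree), *args],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    return {
        "branch": git("rev-parse", "--abbrev-ref", "HEAD").strip(),
        "head": git("rev-parse", "HEAD").strip(),
        # One entry per changed or untracked file, in porcelain format.
        "dirty": git("status", "--porcelain").splitlines(),
    }
```

If a track's transcript shows up in `recently_active` after billing was restored, or its worktree has moved past the commit you last saw, the original has probably revived, and a resume agent for that track would risk the collision this skill describes.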
# Credit-stall mid-orchestration: revive-collision recovery

## Problem

You're orchestrating N parallel subagents (Task tool with `run_in_background: true`), each working on independent tracks (e.g., one PR per track). Mid-run, an Anthropic billing/credit failure or other transient harness outage hits, and all in-flight subagents freeze simultaneously. The user fixes billing, reports back, and asks you to continue.

**The trap:** stalled subagents are NOT dead; they're suspended at the harness layer. When API access is restored, the harness can auto-resume them from the exact tool-call boundary they froze on. If you naively relaunch "resume" agents to pick up where you think the originals stopped, you end up with TWO agents racing on:

- The same git branch (one commits + pushes, the other rebases on top of stale state)
- The same PR (duplicate review comments, conflicting label changes)
- The same files (one Edit lands, then the other Edit overwrites)

Symptoms when this happens: PR has two "no issues found" review comments, force-push storms as both agents rebase, agent A reports "PR already merged" while agent B is still polling CI for that same PR.

**Severity ladder** (observed in an earlier session, 2026-05-08, all three originals revived ~2hr post-stall):

1. **Best case (Track A original)**: revives mid-implementation, notices the merged PR via subsequent harness signals, sends a status update, terminates. Resume agent had pivoted to value-add (PR #571) under a s