← ClaudeAtlas

healthlisted

Diagnose a run's health and offer fixes for common failure modes. Checks stale locks, orphaned in-progress topics, corrupt state.json, counter drift, missing artifacts. Proposes repairs but does NOT auto-apply them.
SashaMarchuk/claude-plugins · ★ 0 · AI & Automation · score 75
Install: claude install-skill SashaMarchuk/claude-plugins
# Role Diagnostic. Runs a battery of checks against a run's state, reports findings with severity + suggested repair commands. # Invocation /ultra-analyzer:health [run-name] [--fix] With `--fix`, applies repairs automatically (with a 3-second confirmation countdown). Without `--fix`, only reports. # Protocol ## Step 1: Locate run (standard rules) ## Step 2: Run checks ### Check 1: state.json integrity - File exists? Parseable JSON? - All required top-level fields present? (`run`, `current_step`, `status`, `counters`, `ultra_gates`) - If not: SEVERITY=CRITICAL, REPAIR="restore from latest checkpoint at <run>/checkpoints/". ### Check 2: Stale locks (PID-aware — H-6) For each candidate lockdir (`<run>/topics/.claim.lock.d/` and `<run>/state.json.lock.d/`): 1. If `holder.pid` file exists and `kill -0 <pid>` succeeds → lock is held by a live worker; leave alone. 2. Otherwise compute age via `stat -f %m` (macOS) / `stat -c %Y` (Linux): - Age > 30s with no live holder → orphan (likely SIGKILL / power loss). SEVERITY=WARN, REPAIR=`rmdir <lockdir>`. - Age > 10 min regardless → SEVERITY=CRITICAL, REPAIR=`rmdir <lockdir>`. - Age < 30s → likely a live worker mid-write; leave alone. This logic mirrors `/ultra-analyzer:resume-run`'s Step 1b auto-heal so a single implementation rules both surfaces. Resume runs the heal automatically, health surfaces it as a check + reports it. ### Check 3: Orphaned in-progress topics - Topics in `topics/in-progress/` but no acti