gza-task-debuglisted
Install: claude install-skill mhawthorne/gza
# Gza Task Debug
Diagnose why a gza task failed by analyzing logs, detecting agent loops, comparing against baselines, and providing actionable recommendations.
## Process
### Step 1: Get task ID
The user should provide a full prefixed task ID (for example, `gza-1234`). Extract it from the input.
### Step 2: Query task from database
Run a Python one-liner to get all task details as JSON:
```bash
uv run python -c "from gza.db import get_task; import json; print(json.dumps(get_task(<ID>), indent=2, default=str))"
```
Note the following fields for analysis:
- `status` — should be `failed` or `max_turns` (or possibly `completed` if user suspects partial failure)
- `num_turns` — number of agent turns used
- `duration_seconds` — total wall-clock time
- `cost_usd` — API cost
- `log_file` — provider conversation transcript path; also inspect the sibling `<stem>.ops.jsonl` file for runner lifecycle, preflight, command, outcome, and stats events
- `report_file` — path to the report (if any)
- `branch` — git branch the task worked on
### Step 3: Baseline comparison
Compare the failed task's metrics against the last 20 completed tasks:
```bash
uv run python -c "from gza.db import get_baseline_stats; import json; print(json.dumps(get_baseline_stats(20)))"
```
Calculate how far the failed task deviates:
- If `num_turns` is 2x+ the average → flag as high turns
- If `cost_usd` is 3x+ the average → flag as high cost
- Report the ratio (e.g., "3.2x more turns than average completed