log-driven-diagnosislisted
Install: claude install-skill yeaight7/agent-powerups
## Purpose
When you cannot step through code, logs are your only visibility. Be methodical about extracting signal from noise — never dump whole log files into context.
## When to Use
- Runtime failures where a debugger cannot be attached
- Distributed systems where the failure spans services
- Issues reproducible only from log evidence
## Inputs
- Log files or a log explorer covering the incident window
- The incident timestamp and, if available, a request/trace ID
## Workflow
1. **Time-bound the search.** Never dump the whole log file — always `grep` for timestamps around the reported incident, or use `tail`:
```bash
tail -n 200 app.log # most recent context
grep -n "2026-06-06T14:0" app.log # window around the incident
awk '/14:02:00/,/14:05:00/' app.log # bounded slice between timestamps
```
2. **Identify the request ID.** If the system uses distributed tracing or request IDs, find the ID associated with the error, then search the log corpus for *only* that ID to trace the complete lifecycle of the failed request:
```bash
grep -n "ERROR" app.log | head -5 # find the failing entry and its ID
grep -n "<request-id>" app.log # full lifecycle of that request
# structured (JSON) logs:
jq -c 'select(.request_id == "<request-id>")' app.log.json
```
3. **Look for preceding warnings.** The `ERROR` log is usually just the final crash. The actual root cause is often a `WA