Install: claude install-skill ak-ship/fullstack-agent-skills
# resilience-engineer — when something fails, fail well
## When to use this skill
Trigger when error handling is the issue. Strong signals:
- "improve the error handling here"
- "add logging / observability"
- "make this more resilient"
- "this should retry on failure"
- "we're swallowing errors somewhere"
- "Sentry isn't getting anything from this service"
Do *not* trigger for happy-path code that hasn't been written yet (just write it), or for live incident response (that needs a responder, not a refactor).
## The output contract
Error-handling changes that:
1. **Make failures observable** — every caught error reaches the logging/monitoring layer with enough context to debug from
2. **Distinguish recoverable from terminal** — transient failures retry with backoff; programmer errors crash loud
3. **Type the errors** — `throw new Error('foo')` becomes typed classes with structured fields
4. **Surface useful messages** — to the user, specific enough to act on; to the developer, specific enough to fix
5. **Don't hide bugs** — never `catch (e) {}` without rethrowing or logging
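
As a rough illustration of how points 1 to 3 and 5 of the contract can fit together, here is a minimal TypeScript sketch. Every name in it (`AppError`, `TransientError`, `TerminalError`, `reportError`, `backoffDelayMs`, `withRetry`) is hypothetical and not part of this skill or any library. `reportError` is a stand-in for whatever logging or monitoring client the project already uses, and the retry count and delays are assumed defaults rather than recommendations.

```typescript
/** Base class with structured fields instead of a bare message string. */
class AppError extends Error {
  readonly code: string;
  readonly context: Record<string, unknown>;
  readonly retryable: boolean;

  constructor(message: string, code: string, context: Record<string, unknown> = {}, retryable = false) {
    super(message);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
    this.code = code;
    this.context = context;
    this.retryable = retryable;
  }
}

/** Transient failure (timeout, 503): retrying may succeed. */
class TransientError extends AppError {
  constructor(message: string, context: Record<string, unknown> = {}) {
    super(message, "TRANSIENT", context, true);
  }
}

/** Terminal failure (bad input, missing config): retrying cannot help. */
class TerminalError extends AppError {
  constructor(message: string, context: Record<string, unknown> = {}) {
    super(message, "TERMINAL", context, false);
  }
}

/** Stand-in for the project's real logging/monitoring call (assumption). */
function reportError(err: unknown, extra: Record<string, unknown> = {}): void {
  const fields = err instanceof AppError ? { code: err.code, ...err.context } : {};
  console.error(JSON.stringify({ message: String(err), ...fields, ...extra }));
}

/** Exponential backoff: baseMs * 2^(attempt - 1), capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

/** Retries only errors marked retryable; everything else is reported and rethrown. */
async function withRetry<T>(op: () => Promise<T>, maxAttempts = 3, baseMs = 100): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const retryable = err instanceof AppError && err.retryable;
      reportError(err, { attempt, retryable }); // every caught error is observable
      if (!retryable || attempt >= maxAttempts) throw err; // never swallow
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt, baseMs)));
    }
  }
}
```

A caller would wrap a flaky operation as `await withRetry(() => fetchOrders())`, where `fetchOrders` is again a hypothetical function that throws `TransientError` on timeouts and `TerminalError` on bad input. Plain `Error` instances and other unknown throwables are treated as non-retryable, so programmer errors still surface immediately.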
## Workflow
### 1 — Find the silent failures
```bash
# Empty catch blocks (-U lets the match span lines; the binding is optional, so "catch {}" matches too)
rg -U 'catch\s*(\([^)]*\))?\s*\{\s*\}'
# Catch that just returns
rg -A 2 'catch\s*\([^)]*\)\s*\{' | rg -B 1 'return( null| undefined| \[\])?;?$'
# Catch that logs to the console (review each hit for a missing rethrow or monitoring call)
rg -A 3 'catch\s*\([^)]*\)\s*\{' | rg 'console\.(log|warn|error)'
# Promises with no .catch
rg '\.then\([^)]+\)(?!\s*\.catch)' -P