Install: claude install-skill kreek/consult
# Observability
## Iron Law
`NO USER-REACHABLE SERVICE PATH SHIPS BLIND.`
## When to Use
- Logs, metrics, traces, health checks, dashboards, SLOs, alerts,
dependency health, incident diagnosis, OpenTelemetry, RED/USE,
cardinality, exemplars, or burn-rate alerts.
## When NOT to Use
- Local-only scripts or libraries with no operational surface.
- Error type design; use `error-handling`.
- Release sequencing; use `release`.
## Core Ideas
1. Instrument behavior customers depend on, not just process internals.
2. Logs are structured events with stable names, typed fields,
severity, outcome, and trace/correlation IDs. JSON alone is not
enough.
3. Use OpenTelemetry semantic conventions where they exist before
inventing custom field names.
4. Metrics need bounded labels; cardinality is a production cost and
reliability risk.
5. Traces show cross-boundary causality; logs explain decisions.
6. Critical dependencies expose latency, error, timeout, retry,
circuit-breaker state, and saturation signals.
7. Dashboards answer current health and likely fault location. Alerts
are SLO-backed, actionable, and tied to runbooks.
8. Health checks separate liveness (should the process be restarted?)
from readiness (should it receive traffic?).
9. Sensitive data is redacted at source; collector filtering is
defense in depth.
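
As an illustration of ideas 2, 4, and 9, the sketch below builds a structured log event with a stable name, typed fields, severity, outcome, and a trace ID, redacts sensitive fields before the event leaves the process, and clamps a metric label to a bounded set. It uses only the Python standard library. The names `SENSITIVE_FIELDS`, `ALLOWED_ROUTES`, `redact`, `build_event`, `log_event`, and `bounded_label` are hypothetical, as are the field and route names; a real project would follow its own conventions and OpenTelemetry semantic conventions where they exist.

```python
import json
import logging
import time

# Hypothetical list of sensitive field names; adjust per project.
SENSITIVE_FIELDS = frozenset({"password", "authorization", "card_number", "email"})

# Hypothetical bounded set of route labels for metrics.
ALLOWED_ROUTES = frozenset({"/checkout", "/cart", "/login"})


def redact(fields: dict) -> dict:
    """Replace sensitive values at source, before anything is emitted."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_FIELDS else value
        for key, value in fields.items()
    }


def build_event(name: str, level: int, outcome: str, trace_id: str, **fields) -> dict:
    """Build one structured event; the fixed keys always win over extra fields."""
    return {
        **redact(fields),
        "timestamp": time.time(),
        "event": name,                             # stable event name
        "severity": logging.getLevelName(level),   # e.g. "INFO"
        "outcome": outcome,                        # e.g. "success" or "failure"
        "trace_id": trace_id,                      # correlates the log with a trace
    }


def log_event(logger: logging.Logger, name: str, *, level: int = logging.INFO,
              outcome: str, trace_id: str, **fields) -> str:
    """Serialize the event as one JSON line and emit it at the given level."""
    line = json.dumps(build_event(name, level, outcome, trace_id, **fields), sort_keys=True)
    logger.log(level, line)
    return line


def bounded_label(value: str, allowed: frozenset, fallback: str = "other") -> str:
    """Map unknown values to a single fallback so label cardinality stays bounded."""
    return value if value in allowed else fallback


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    log = logging.getLogger("checkout")
    log_event(log, "checkout.completed", outcome="success", trace_id="abc123",
              email="user@example.com", amount_cents=1299)
    print(bounded_label("/user/12345", ALLOWED_ROUTES))
```

Passing a raw path such as `/user/12345` through `bounded_label` before using it as a metric label keeps the label set fixed, while the full path can still go into a log field or span attribute where cardinality is cheaper.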
## Workflow
1. Identify the user-facing path, dependency, queue, or resource being
observed. Choose RED (rate, errors, duration) for request paths and
USE (utilization, saturation, errors) for resources.
2. Add structured logs, metrics, and spans per project convent