Install: `claude install-skill komluk/scaffolding`
# Monitoring & Observability Skill
## Purpose
Standards for monitoring, metrics, alerting, and observability.
## Auto-Invoke Triggers
- Setting up monitoring infrastructure
- Defining metrics and KPIs
- Configuring alerts
- Implementing distributed tracing
---
## Three Pillars of Observability
| Pillar | Purpose | Tools |
|--------|---------|-------|
| **Logs** | Event records | ELK, Loki, CloudWatch |
| **Metrics** | Numerical measurements | Prometheus, Datadog |
| **Traces** | Request flow | Jaeger, Zipkin, X-Ray |
---
## Key Metrics (Golden Signals)
### The Four Golden Signals
| Signal | Description | Example Metric |
|--------|-------------|----------------|
| **Latency** | Time taken to serve a request | p50, p95, p99 latency |
| **Traffic** | Demand placed on the system | Requests per second |
| **Errors** | Rate of failed requests | Error percentage |
| **Saturation** | How full the system's resources are | CPU, memory utilization |
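As a rough illustration of how the example metrics in the table could be derived from raw samples, the sketch below computes latency percentiles with the nearest-rank method, plus traffic and error percentage for one measurement window. The function names (`percentile`, `golden_signals`) and their parameters are hypothetical and chosen only for this example; a real deployment would normally take these values from a metrics backend rather than computing them by hand.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: the smallest sample with at least q% of
    all samples at or below it. `q` is expected in the range (0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def golden_signals(latencies_ms, failed_requests, window_seconds, cpu_utilization):
    """Summarize one measurement window as the four golden signals.

    Assumption for this sketch: each entry in `latencies_ms` is one
    request, and saturation is represented by CPU utilization alone.
    """
    total = len(latencies_ms)
    return {
        "latency_p50_ms": percentile(latencies_ms, 50),
        "latency_p95_ms": percentile(latencies_ms, 95),
        "latency_p99_ms": percentile(latencies_ms, 99),
        "traffic_rps": total / window_seconds,
        "error_percent": 100 * failed_requests / total,
        "saturation_cpu": cpu_utilization,
    }


if __name__ == "__main__":
    # 100 synthetic requests with latencies 1..100 ms over a 50 s window.
    print(golden_signals(list(range(1, 101)), 5, 50, 0.7))
```

Percentiles are used here instead of an average because a small number of very slow requests can hide behind a healthy-looking mean.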
### RED Method (Request-focused)
- **R**ate - Requests per second
- **E**rrors - Failed requests per second
- **D**uration - Request latency
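The three RED values can be computed from a list of request records, as in the minimal sketch below. The `Request` dataclass and the `red_metrics` function are hypothetical names introduced for this example, and treating any 5xx status code as a failed request is an assumption rather than a rule stated in this document.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_seconds: float
    status_code: int


def red_metrics(requests, window_seconds):
    """Compute Rate, Errors, and Duration for one measurement window.

    Assumption for this sketch: a request counts as failed when its
    status code is 500 or higher.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    failed = [r for r in requests if r.status_code >= 500]
    durations = [r.duration_seconds for r in requests]
    return {
        "rate_rps": len(requests) / window_seconds,
        "errors_rps": len(failed) / window_seconds,
        "duration_avg_seconds": sum(durations) / len(durations) if durations else 0.0,
        "duration_max_seconds": max(durations, default=0.0),
    }


if __name__ == "__main__":
    window = [
        Request(0.25, 200),
        Request(0.5, 200),
        Request(0.75, 500),
        Request(0.5, 503),
    ]
    print(red_metrics(window, window_seconds=2))
```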
### USE Method (Resource-focused)
- **U**tilization - Percentage of time the resource is busy
- **S**aturation - Work the resource cannot serve yet (e.g. queue depth)
- **E**rrors - Count of error events on the resource
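For a single resource, the USE values could be derived as in the following sketch. The function name `use_metrics` and its inputs (busy time, periodic queue-depth samples, an error count) are assumptions made for illustration; how these numbers are actually collected depends on the resource and the tooling in use.

```python
def use_metrics(busy_seconds, window_seconds, queue_depth_samples, error_count):
    """Summarize one resource over a measurement window.

    Assumption for this sketch: saturation is approximated by queue depth
    sampled at regular intervals during the window.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    samples = list(queue_depth_samples)
    return {
        "utilization_percent": 100 * busy_seconds / window_seconds,
        "saturation_avg_queue_depth": sum(samples) / len(samples) if samples else 0.0,
        "saturation_max_queue_depth": max(samples, default=0),
        "errors": error_count,
    }


if __name__ == "__main__":
    # A disk that was busy for 45 of 60 seconds, with four queue-depth samples.
    print(use_metrics(45, 60, [0, 2, 4, 2], error_count=3))
```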
---
## Application Metrics
### HTTP Endpoints
| Metric | Type | Description |
|--------|------|-------------|
| `http_requests_total` | Counter | Total requests |
| `http_request_duration_seconds` | Histogram | Request latency |
| `http_requests_in_flight` | Gauge | Active requests |
| `http_response_size_bytes` | Histogram