slo-designer
Install: claude install-skill hotak92/vibecoded-orchestrator
# SLO Designer (Opus)
**Purpose**: Take a service description (what it does, who uses it, traffic shape) and produce: a chosen SLI, an SLO target with rationale, a 30-day error budget, and a complete set of Prometheus recording + alerting rules using the multi-window multi-burn-rate pattern.
**Model**: Opus 4.7 at high effort. SLO design combines quantitative reasoning (budget math, burn-rate thresholds) with qualitative reasoning (what users actually feel), and both benefit from careful thought.
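The budget math and burn-rate thresholds mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, and the window and budget-fraction pairs (2% of the budget in 1 hour, 5% in 6 hours, 10% in 3 days) are assumed from the commonly cited multi-window multi-burn-rate defaults rather than taken from this document.

```python
HOURS_PER_DAY = 24
MINUTES_PER_HOUR = 60


def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full outage the SLO tolerates over the window.

    slo_target is a fraction, e.g. 0.999 for "99.9%".
    """
    window_minutes = window_days * HOURS_PER_DAY * MINUTES_PER_HOUR
    return (1.0 - slo_target) * window_minutes


def burn_rate_threshold(
    budget_fraction: float, alert_window_hours: float, window_days: int = 30
) -> float:
    """Burn rate that spends budget_fraction of the budget in alert_window_hours.

    A burn rate of 1.0 spends exactly the whole budget over the SLO window.
    """
    slo_window_hours = window_days * HOURS_PER_DAY
    return budget_fraction * slo_window_hours / alert_window_hours


# Assumed (budget fraction, long alert window in hours) pairs; adjust per service.
ASSUMED_WINDOWS = [(0.02, 1), (0.05, 6), (0.10, 72)]

if __name__ == "__main__":
    print(round(error_budget_minutes(0.999), 2))  # 43.2
    for fraction, hours in ASSUMED_WINDOWS:
        print(hours, round(burn_rate_threshold(fraction, hours), 1))
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43 minutes of budget, and the three windows map to burn-rate thresholds of about 14.4, 6, and 1. In generated alerting rules, each threshold would typically be compared against the observed error ratio divided by `1 - slo_target`.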
## When to Invoke Autonomously
1. A new service is being onboarded and has no monitoring beyond basic up/down
2. An existing alert is flapping and the team is tired of being paged on transient blips
3. An incident retrospective concludes "we should have caught this earlier" and an SLO-based alert would have caught it
4. Leadership asks "what's our reliability target for X?"
5. A platform team is rolling out org-wide SLO standards and needs per-service tailoring
## DO NOT invoke for
- Internal batch/cron jobs (use Prometheus `up` and `prometheus_rule_evaluation_failures_total`; SLOs are for user-facing reliability)
- Services with < 100 RPS, where too few events make the SLI ratio noisy (use synthetic checks instead)
- Pre-prod environments (don't waste budget tracking dev)
## Method
### Step 1 — Pick the right SLI
The SLI is a ratio of *good events* to *valid events*. Three SLI archetypes cover most cases:
| Archetype | Numerator | Denominator | Good for |
|---|---|---|---|
| **Availability** | non-5xx responses |