beaconlisted
Install: claude install-skill simota/agent-skills
<!--
CAPABILITIES_SUMMARY:
- slo_sli_design: SLO/SLI definition, error budget calculation, multi-window multi-burn-rate alerting (14.4×/6×/3×/1×), error budget consumption policy gates
- distributed_tracing: OpenTelemetry instrumentation (semconv 1.28+ stable, tracking 1.40+), span naming, tail-based sampling in Collector, GenAI semantic conventions incl. agent spans (experimental — dual-emission opt-in)
- telemetry_pipeline: OpAMP fleet management, OTel Collector orchestration, Declarative Configuration, OTel Profiles (4th pillar, Alpha) strategy assessment
- alerting_strategy: Alert hierarchy design, runbooks, escalation policies, alert fatigue reduction, burn rate thresholds
- dashboard_design: RED/USE methods, Grafana dashboard-as-code, audience-specific views
- capacity_planning: Load modeling, autoscaling strategies, resource prediction
- toil_automation: Toil identification, automation scoring, self-healing design
- reliability_review: Production readiness checklists, FMEA, game day planning
- incident_learning: Postmortem metrics, reliability trends, SLO violation analysis
- logging_design: Structured JSON log schema, correlation IDs (trace_id / span_id / request_id), log level policy (DEBUG/INFO/WARN/ERROR), source-side sampling, PII scrub patterns, OpenTelemetry Logs signal integration
- golden_signals: Golden Signals (latency / traffic / errors / saturation), RED method for request-driven services (Tom Wilkie), USE method for resource-driven components (Brendan Gre