Install: claude install-skill ankurCES/blumi-cli
# Monitoring Expert
Observability and performance specialist implementing comprehensive monitoring, alerting, tracing, and performance testing systems.
## Core Workflow
1. **Assess** — Identify what needs monitoring (SLIs, critical paths, business metrics)
2. **Instrument** — Add logging, metrics, and traces to the application (see examples below)
3. **Collect** — Configure aggregation and storage (Prometheus scrape, log shipper, OTLP endpoint); verify data arrives before proceeding
4. **Visualize** — Build dashboards using RED (Rate/Errors/Duration) or USE (Utilization/Saturation/Errors) methods
5. **Alert** — Define threshold and anomaly alerts on critical paths; validate no false-positive flood before shipping
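As a rough illustration of steps 4 and 5, the sketch below computes RED-style numbers (rate, error ratio, p95 duration) from a window of request records and applies a simple threshold check. The helper names `summarizeRed` and `shouldAlert`, the record shape `{ status, durationMs }`, and the threshold values are assumptions made for this example, not part of any library. In a real deployment these figures would more likely come from queries against the metrics backend; the code only shows the arithmetic.

```js
// Hypothetical helper: summarize a window of requests with the RED method.
// Each record is assumed to look like { status: number, durationMs: number }.
function summarizeRed(requests, windowSeconds) {
  if (windowSeconds <= 0) {
    throw new RangeError('windowSeconds must be positive');
  }
  const total = requests.length;
  // Treat 5xx responses as errors; adjust to match your own SLI definition.
  const errors = requests.filter((r) => r.status >= 500).length;
  const durations = requests.map((r) => r.durationMs).sort((a, b) => a - b);

  // Nearest-rank percentile: the smallest sample with at least p% of
  // samples at or below it. Returns 0 for an empty window.
  const percentile = (p) =>
    total === 0 ? 0 : durations[Math.max(0, Math.ceil((p / 100) * total) - 1)];

  return {
    ratePerSecond: total / windowSeconds, // Rate
    errorRatio: total === 0 ? 0 : errors / total, // Errors
    p95Ms: percentile(95), // Duration
  };
}

// Hypothetical threshold alert: fires when either limit is exceeded.
function shouldAlert(summary, { maxErrorRatio, maxP95Ms }) {
  return summary.errorRatio > maxErrorRatio || summary.p95Ms > maxP95Ms;
}

// Example usage with made-up sample data and thresholds.
const sample = [
  { status: 200, durationMs: 100 },
  { status: 200, durationMs: 200 },
  { status: 500, durationMs: 300 },
  { status: 200, durationMs: 400 },
];
const summary = summarizeRed(sample, 2);
const firing = shouldAlert(summary, { maxErrorRatio: 0.05, maxP95Ms: 1000 });
```

Replaying recorded traffic through a check like `shouldAlert` before shipping is one way to estimate how often a rule would have fired, which helps catch a false-positive flood early.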
## Quick-Start Examples
### Structured Logging (Node.js / Pino)
```js
import pino from 'pino';
const logger = pino({ level: 'info' });
// Good — structured fields, includes correlation ID
// `req` and `elapsed` are assumed to come from the surrounding request handler
logger.info({ requestId: req.id, userId: req.user.id, durationMs: elapsed }, 'order.created');
// Bad — string interpolation, no correlation
console.log(`Order created for user ${userId}`);
```
### Prometheus Metrics (Node.js)
```js
import { Counter, Histogram, register } from 'prom-client';
const httpRequests = new Counter({
name: 'http_requests_total',
help: 'Total HTTP requests',
labelNames: ['method', 'route', 'status'],
});
const httpDuration = new Histogram({
name: 'http_request_duration_seconds',
help: 'HTTP request latency',
labelNames: ['method', 'rou