distributed-logging

This skill should be used when the user designs "distributed logging", "log aggregation", "centralized logs", an "ELK" or "EFK" stack, "log shipping", "structured logging", a "correlation ID" or "trace ID" in logs, "log retention", or "high-volume log ingest". It gives the collect → buffer → ship → index → store → retain pipeline, sampling, ordering, and cold-storage tiering. Use it whenever many services emit logs that must be searched in one place under load, even if the user doesn't say "logging pipeline".
proyecto26/system-design-skills · ★ 6 · Data & Documents · score 76

Install: claude install-skill proyecto26/system-design-skills

# Distributed logging

Move logs from thousands of processes into one searchable place, fast enough to debug a live incident and cheap enough to keep for months. Getting it wrong is a classic "ignore failure" miss: the logging pipeline is itself a distributed system that buckles under the exact traffic spike you most need it during, and a naive design either drops the evidence or takes down the app it instruments.

## When to reach for this

More than one process emits logs and someone needs to search them together; an incident requires correlating a request across services; log volume has outgrown `grep` on a box; or compliance demands retention. The pipeline buys central search, cross-service correlation, and a durable record decoupled from any single host.

## When NOT to

A single service on one host where `journald` + log rotation is enough: a full pipeline is pure operational overhead (YAGNI). Numeric time-series questions ("what is p99 latency", "is error rate up") belong to metrics, not log scans (that is `observability`'s job); logs answer "what exactly happened to *this* request". Don't ship every debug line at full volume before a number shows the volume justifies the cost; sample first.

## Clarify first

- **Volume and peak**: lines/sec and bytes/sec, average and peak (→ `back-of-the-envelope`). This sizes every stage.
- **Structured or free-text**: can producers emit JSON now, or is there legacy text to parse?
- **Query latency need**: interactive search in sec