perf-profilinglisted
Install: claude install-skill nguyenthienthanh/aura-frog
> **AI-consumed reference.** Optimized for Claude to read during execution.
> Human-readable explanation: see [docs/architecture/HIERARCHICAL_PLANNING.md](../../../docs/architecture/HIERARCHICAL_PLANNING.md)
> or [docs/getting-started/](../../../docs/getting-started/) depending on topic.
# Performance Profiling
**Rule 0:** Don't optimize what you haven't measured. Almost all "obvious" optimizations are wrong.
---
## The Protocol
### Step 1 — Define "fast enough"
Performance is relative. Before optimizing:
- **Current metric** (p50, p95, p99 latency / MB memory / % CPU)
- **Target metric** (what's the SLO/SLA/user expectation?)
- **Gap** — what needs to close
If no target → **ask the user**. Don't optimize without a goal — you'll polish forever.
### Step 2 — Measure baseline
Use the appropriate profiler:
| Language | Profiler |
|----------|----------|
| Node.js | `node --prof`, `clinic.js`, `0x` |
| Browser JS | Chrome DevTools Performance tab |
| Python | `cProfile`, `py-spy`, `memory_profiler` |
| Go | `pprof` (built-in) |
| Rust | `cargo flamegraph`, `perf` |
| CLI | `hyperfine` |
| HTTP | `wrk`, `k6`, `vegeta` |
Run the profiler under **realistic load**, not trivial input. Capture for 30–60s minimum — short captures miss long-tail events.
### Step 3 — Analyze (flamegraph / top consumers)
Identify the top 3 functions by:
- **Self time** (function's own CPU, excluding descendants)
- **Total time** (self + descendants)
- **Call count** (high-frequency cold func