`finops-review`

Install: `claude install-skill bakw00ds/yakos`
# FinOps Review
## Purpose
Look at the dispatch-log with a finance hat on. Answer:
- Where is the money going? Per feature, per agent, per model.
- What fraction of input tokens hits the cache? (Anything below
~70% on a stable system prompt is a smell.)
- Is the routing sensible? Are opus calls doing work that haiku /
gpt-5-nano / gemini-flash could do for 1/30th the cost?
- Are there workloads on the realtime API that should be on the
batch API (50% discount, 24h SLA)?

The output is a list of *opportunities*, ranked by estimated monthly
savings, not a blame report. Owned by the `ai-finops` agent.
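
As a rough illustration of the questions above, here is a minimal Python sketch of the cache-hit check and the savings ranking. It is not the `ai-finops` agent's actual implementation. The field names `input_tokens` and `cached_input_tokens`, the assumption that `input_tokens` already includes cached tokens, and the `est_monthly_savings` key are all hypothetical. Only `system_prompt_hash`, the ~70% threshold, and the 50% batch discount come from this document.

```python
import json
from collections import defaultdict

CACHE_HIT_THRESHOLD = 0.70  # the "~70%" smell threshold mentioned above
BATCH_DISCOUNT = 0.50       # the batch API discount mentioned above


def cache_hit_rates(ndjson_lines):
    """Return {system_prompt_hash: cached fraction of input tokens}."""
    totals = defaultdict(lambda: [0, 0])  # hash -> [cached, total]
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        rec = json.loads(line)
        # Hypothetical token fields; input_tokens is assumed to
        # include the cached tokens.
        bucket = totals[rec["system_prompt_hash"]]
        bucket[0] += rec.get("cached_input_tokens", 0)
        bucket[1] += rec.get("input_tokens", 0)
    return {h: c / t for h, (c, t) in totals.items() if t > 0}


def low_cache_prompts(rates, threshold=CACHE_HIT_THRESHOLD):
    """Prompt hashes below the threshold, worst hit rate first."""
    flagged = [h for h, r in rates.items() if r < threshold]
    return sorted(flagged, key=lambda h: rates[h])


def batch_savings(monthly_realtime_cost, discount=BATCH_DISCOUNT):
    """Estimated monthly savings from moving a workload to batch."""
    return monthly_realtime_cost * discount


def rank_opportunities(opportunities):
    """Sort opportunities by estimated monthly savings, largest first."""
    return sorted(
        opportunities,
        key=lambda o: o["est_monthly_savings"],
        reverse=True,
    )
```

Feeding `cache_hit_rates` the lines of a dispatch log and passing the result to `low_cache_prompts` gives the prompts worth inspecting for instability. `rank_opportunities` then orders whatever candidate list the review builds, so the largest estimated saving comes first.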
## Scope
- Reads `~/.yakos-state/dispatch-log*.ndjson` (current + rotated).
- Joins with the agent registry (model alias per agent) and the
runtime billing snapshot to compute per-call cost.
- Computes:
  - **Spend by feature.** Tag-based: each dispatch carries a
    `feature_tag` (set by the lead or inferred from the calling
    agent's domain).
  - **Cache hit rate per system prompt.** Grouped by `system_prompt_hash`.
    Low hit rates point at unstable prompts (date-stamped headers,
    shuffled examples, etc.).
  - **Model routing audit.** For each agent, the distribution of
    model choices. Opus on a `cheap`-eligible agent is flagged.
  - **Batch-eligible candidates.** Workloads with high volume + low
    latency-sensitivity (offline rubric scoring, summarization
    backfills, etc.) that are running on the realtime API.
- Output is a markdown report with thr