context-compressionlisted

REDUCING context size — summarization strategies, anchored iterative summarization, tokens-per-task optimization, compaction triggers, and probe-based evaluation. Use when the user asks to "compress context", "summarize conversation history", "implement compaction", "reduce token usage", or mentions structured summarization or long-running sessions exceeding context limits. NOT for diagnosing context failures or degradation patterns (use context-degradation), NOT for KV-cache optimization or context partitioning (use context-optimization), NOT for learning context theory or basics (use context-fundamentals), NOT for file-based offloading or scratch pads (use filesystem-context).
viktorbezdek/skillstack · ★ 9 · AI & Automation · score 76

Install: claude install-skill viktorbezdek/skillstack

# Context Compression Strategies When agent sessions generate millions of tokens of conversation history, compression becomes mandatory. The naive approach is aggressive compression to minimize tokens per request. The correct optimization target is tokens per task: total tokens consumed to complete a task, including re-fetching costs when compression loses critical information. ## When to Use / Not Use **Use when:** - Agent sessions exceed context window limits - Codebases exceed context windows (5M+ token systems) - Designing conversation summarization strategies - Debugging cases where agents "forget" what files they modified - Building evaluation frameworks for compression quality **Do NOT use when:** - Diagnosing why context is degrading -> use `context-degradation` - KV-cache optimization or context partitioning -> use `context-optimization` - Learning foundational context theory -> use `context-fundamentals` - File-based offloading or scratch pads -> use `filesystem-context` ## Decision Tree ``` Why do you need compression? ├── Agent sessions hitting context limits │ ├── What matters most? │ │ ├── File tracking + decision history (long sessions) -> Anchored Iterative Summarization │ │ ├── Maximum token savings (short sessions, low re-fetch cost) -> Opaque Compression │ │ └── Readability + phase boundaries -> Regenerative Full Summary │ └── Not sure? -> Start with Anchored Iterative (best quality trade-off) ├── Need to measure if compression is work