design-memory-eval
Install: `claude install-skill Sisuthros/claude-amplifier`
# Design a Memory Eval
The trust-critical paths (Pattern Oracle risk scoring, stale-memory detection,
verification-gated promotion, and write-verification) are exactly the places
where a silent regression does the most damage: a wrong risk score, a missed
stale day, a hallucinated success that slips through. Changes here must be
proven by a deterministic test, not merely described in a commit message.
## When to use
You are adding or modifying any of:
- the **Pattern Oracle** (pre-task risk scan / scoring),
- **stale-memory detection** (`amplify_audit_freshness`, promote-from-memory),
- **verification-gated memory** (claim → evidence → confirmed, 5× weighting),
- **write-verification** (read-back, `AmplifierWriteError`).
## Procedure
1. **Write the failing scenario first.** Before the implementation, add a test
that encodes the behavior you want and currently fails (red). This proves the
test actually exercises the new behavior rather than passing vacuously.
2. **Use deterministic fixture data.** No wall-clock time, no randomness, no network.
Seed an in-memory or temp SQLite store with fixed rows; pin dates as literal
strings. The same input must always produce the same score/verdict so the
test can't flake. (The existing `oracle.test.js`, `freshness.test.js`, and
`write_verification.test.js` are the templates — match their style.)
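
   A minimal sketch of what a deterministic fixture could look like. `findStaleMemories` and `FIXTURE_ROWS` are hypothetical names used for illustration only. They are not the project's real API (`amplify_audit_freshness` may have a different shape), and a plain array stands in for the seeded SQLite store. The point is that "now" is passed in as a literal date string, so the verdict never depends on when the test runs.

   ```javascript
   // Hypothetical sketch: findStaleMemories is NOT the project's real API.
   const assert = require("node:assert");

   const MS_PER_DAY = 24 * 60 * 60 * 1000;

   // Fixed rows standing in for a seeded in-memory or temp SQLite store.
   // Dates are pinned as literal strings (date-only ISO strings parse as UTC).
   const FIXTURE_ROWS = [
     { id: "m1", verifiedAt: "2024-01-01" },
     { id: "m2", verifiedAt: "2024-03-01" },
     { id: "m3", verifiedAt: "2024-03-30" },
   ];

   // "Now" is an argument, never Date.now(), so the result cannot drift.
   function findStaleMemories(rows, asOf, maxAgeDays) {
     const asOfMs = Date.parse(asOf);
     return rows
       .filter((row) => (asOfMs - Date.parse(row.verifiedAt)) / MS_PER_DAY > maxAgeDays)
       .map((row) => row.id);
   }

   // m2 is exactly 30 days old on 2024-03-31, so it sits on the boundary
   // and is not reported as stale under a strict "older than 30 days" rule.
   const stale = findStaleMemories(FIXTURE_ROWS, "2024-03-31", 30);
   assert.deepStrictEqual(stale, ["m1"]);
   ```

   Including a row that sits exactly on the threshold makes an off-by-one regression in the comparison fail loudly instead of silently shifting the verdict.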
3. **Assert both layers.** Where the feature returns both human-readable text and
structured data, assert **both**: the structured f