askit-evaluatelisted
Install: claude install-skill product-on-purpose/agent-skills-toolkit
# askit-evaluate
## Purpose
Assess a known, local component or plugin against `STANDARD.md`. Three modes. `conformance` (the default) runs the deterministic portable scripts and returns a per-rule report (pass / warn / error), the satisfied tier, and concrete remediation. `behavioral` runs a skill against its eval-set and judges whether it triggers and behaves as expected, delegating to `askit-quality-grader`. `review` forms a qualitative judgment (correctness, altitude, naming, whether a component is warranted), delegating to `askit-reviewer`. Only `conformance` is deterministic and gate-safe; `behavioral` and `review` are opt-in LLM-judged passes that produce evidence, never a CI gate result (Design Principle 3, ADR 0023).
## When to use
When the user asks to evaluate, audit, or check a skill or plugin, asks "what tier is this" or "what is blocking the next tier" (conformance), asks whether a skill actually triggers and behaves correctly (behavioral), or wants a qualitative review (review).
## conformance mode (default, deterministic)
1. Determine the target path (a plugin root with `library.json`, or a single skill directory with `SKILL.md`).
2. Run: `node scripts/evaluate.mjs <path> --json`.
3. Present the findings grouped by rule, the tier (for a plugin), and the remediation. Lead with errors, then warnings.
4. If there are warnings or errors, point the user at `askit-build-skill` in `improve` mode to fix them.
## behavioral mode (opt-in, LLM-judged)
1. Locate the ta