← ClaudeAtlas

askit-evaluatelisted

Evaluates a skill or plugin against the Advanced Skill Library Standard across three modes, producing deterministic conformance findings and a tier, an opt-in behavioral pass, and a qualitative review. Use when you want to audit conformance, judge whether a skill behaves and triggers correctly, get a qualitative review, or see what blocks the next tier.
product-on-purpose/agent-skills-toolkit · ★ 0 · AI & Automation · score 75
Install: claude install-skill product-on-purpose/agent-skills-toolkit
# askit-evaluate ## Purpose Assess a known, local component or plugin against `STANDARD.md`. Three modes. `conformance` (the default) runs the deterministic portable scripts and returns a per-rule report (pass / warn / error), the satisfied tier, and concrete remediation. `behavioral` runs a skill against its eval-set and judges whether it triggers and behaves as expected, delegating to `askit-quality-grader`. `review` forms a qualitative judgment (correctness, altitude, naming, whether a component is warranted), delegating to `askit-reviewer`. Only `conformance` is deterministic and gate-safe; `behavioral` and `review` are opt-in LLM-judged passes that produce evidence, never a CI gate result (Design Principle 3, ADR 0023). ## When to use When the user asks to evaluate, audit, or check a skill or plugin, asks "what tier is this" or "what is blocking the next tier" (conformance), asks whether a skill actually triggers and behaves correctly (behavioral), or wants a qualitative review (review). ## conformance mode (default, deterministic) 1. Determine the target path (a plugin root with `library.json`, or a single skill directory with `SKILL.md`). 2. Run: `node scripts/evaluate.mjs <path> --json`. 3. Present the findings grouped by rule, the tier (for a plugin), and the remediation. Lead with errors, then warnings. 4. If there are warnings or errors, point the user at `askit-build-skill` in `improve` mode to fix them. ## behavioral mode (opt-in, LLM-judged) 1. Locate the ta