Install: claude install-skill ngocsangyem/MeowKit
# mk:evaluate — Behavioral Active Verification
Step-file workflow that drives a running build, probes each rubric criterion via active verification, and produces a graded verdict with runtime evidence. Owned by the `evaluator` agent (Phase 3+).
## When to Use
Activate when:
- User runs `/mk:evaluate <target>` with a URL, file path, or running-app handle
- A generator iteration completes and the harness needs a graded verdict
- The pipeline is past Phase 3 (build) and not yet at Phase 5 (ship) for a frontend, fullstack, or CLI product
- User asks to "grade the running app", "check the build behaviorally", or "verify against the spec"
Skip when:
- The build has no runnable artifact (pure library, type-only package)
- The task is structural code review only — use `mk:review` instead
- The task is a simple `/mk:fix` — the evaluation overhead exceeds its value
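
The activation and skip rules above can be read as a small decision function. The sketch below is illustrative only: the `Task` descriptor, its field names, and `should_evaluate` are hypothetical and are not part of MeowKit. It also assumes that the skip rules take priority over the activation rules, which the list does not state.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task descriptor; every field name here is an assumption."""
    has_runnable_artifact: bool            # False for a pure library or type-only package
    kind: str                              # e.g. "build", "review", "fix"
    is_simple_fix: bool = False            # a simple /mk:fix
    explicit_request: bool = False         # user ran /mk:evaluate or asked for grading
    generator_iteration_done: bool = False # harness needs a graded verdict
    between_build_and_ship: bool = False   # after Phase 3, before Phase 5


def should_evaluate(task: Task) -> bool:
    """Return True when the listed conditions call for mk:evaluate."""
    # Skip rules are checked first (assumed priority, not stated in the source).
    if not task.has_runnable_artifact:
        return False  # nothing to drive
    if task.kind == "review":
        return False  # structural code review belongs to mk:review
    if task.kind == "fix" and task.is_simple_fix:
        return False  # overhead exceeds value
    # Any single activation rule is enough.
    return (
        task.explicit_request
        or task.generator_iteration_done
        or task.between_build_and_ship
    )
```

For example, `should_evaluate(Task(has_runnable_artifact=False, kind="build", explicit_request=True))` returns `False` under these assumptions, because the missing runnable artifact is checked before the explicit request.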
## Hard Constraints
1. **Active verification gate** — every verdict MUST include a non-empty `evidence/` directory containing at least one of: a screenshot, an HTTP response capture, or a CLI stdout+exit-code transcript. `validate-verdict.sh` rejects PASS verdicts with empty evidence and converts them to FAIL.
2. **Skeptic persona enforced** — load `prompts/skeptic-persona.md` at session start. Re-anchor to it before grading each criterion.
3. **Max 15 criteria per session** — split the rubric across multiple sessions if it has more than 15 criteria. The limit is a heuristic: the risk of context overflow rises above it.
4. **No source code edits** — evaluator owns `tasks/reviews/*-evalverdict.md` only. Never modifies