eval-agent
Run evaluation tests against an agent to assess quality and archetype resistance
Quality Score: 90/100
Details
- Author: jmagly
- Repository: jmagly/aiwg
- Created: 9 months ago
- Last Updated: yesterday
- Language: TypeScript
- License: MIT
Similar Skills
Semantically similar based on skill content — not just same category
eval-report
Generate an aggregate agent quality report from evaluation results, showing scores, regressions, and recommendations
agent-eval
Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics
agent-eval
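[Agent Evaluation] Evaluates the output quality of an AI agent. Triggers when the user says "评估 agent" (evaluate agent), "测试 agent 质量" (test agent quality), "agent eval", or "检查 agent 输出" (check agent output).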