bernstein-quality
Show quality metrics for Bernstein runs: success rates per model, lint/test pass rates, and completion time distributions. Use when the user asks about quality, reliability, which model performs best, or pass rates.
Quality Score: 89/100
Details
- Author: sipyourdrink-ltd
- Repository: sipyourdrink-ltd/bernstein
- Created: 2 months ago
- Last Updated: today
- Language: Python
- License: Apache-2.0
Similar Skills
Semantically similar based on skill content — not just same category
bernstein-cost
Show detailed cost breakdown and budget status for the Bernstein orchestrator. Use when the user asks about spending, budget, cost per model, cost per agent, or wants a cost projection.
benchmark
Run metric quality benchmark, store results, and compare against previous runs. Invoke with /benchmark, "run benchmark", "benchmark metrics", "check metric quality".
bernstein-status
Show Bernstein orchestrator status - active agents, task progress, costs, and alerts. Use when the user asks about orchestrator status, what agents are doing, task progress, how much has been spent, or what's happening with the build.
bernstein-agents
Manage Bernstein agents - list active agents, inspect their output, kill stalled agents, or stream live logs. Use when the user asks about agents, wants to see what an agent is doing, or needs to kill one.
cost-quality-frontier
Adds cost (input + output tokens × model price) and latency (p50, p95) to eval results, plots model options on a Pareto frontier, and produces a quality-per-dollar composite score so production model selection is grounded in trade-offs, not just quality. Use when: model comparison, cost-aware evals, latency budget, quality-per-dollar, Pareto frontier, model selection, eval economics, picking a model for production.