bernstein-quality

Solid

Show quality metrics for Bernstein runs - success rates per model, lint/test pass rates, completion time distributions. Use when the user asks about quality, reliability, which model performs best, or pass rates.

AI & Automation 744 stars 78 forks Updated today Apache-2.0

Quality Score: 86/100

Stars 20%
Recency 20%: 100
Frontmatter 20%
Documentation 15%
Issue Health 10%
License 10%: 100
Description 5%: 100

Skill Content

# Bernstein Quality Metrics

Analyze quality and reliability of agent-generated code.

## When to Use

- User asks "how reliable are the agents?" or "which model is best?"
- User wants success rates, pass rates, or completion time stats
- User asks about test failures or lint issues across models
- User says "show me quality metrics"

## Instructions

1. Run `scripts/quality.sh metrics` for overall quality metrics.
2. Run `scripts/quality.sh pass-rates` for lint/typecheck/test pass rates by model.
3. Run `scripts/quality.sh times` for completion time distributions.
4. Present a quality dashboard:

```
## Quality Dashboard

### Success Rate by Model
| Model | Tasks | Success | Fail | Rate |
|-------|-------|---------|------|------|
| claude-sonnet-4 | 24 | 22 | 2 | 91.7% |
| gpt-4.1 | 12 | 10 | 2 | 83.3% |

### Pass Rates
| Check | Overall | claude-sonnet-4 | gpt-4.1 |
|-------|---------|-----------------|---------|
| Lint | 96% | 98% | 92% |
| Type-check | 88% | 91% | 83% |
| Tests | 85% | 89% | 75% |

### Completion Times
| Percentile | Time |
|------------|------|
| p50 | 3m 20s |
| p90 | 8m 45s |
| p99 | 15m 12s |
```

5. Highlight any models with significantly lower pass rates.
6. Recommend model routing adjustments if one model consistently underperforms.
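The skill text does not define what counts as a "significantly lower" rate in step 5. One way to make it concrete is sketched below, using the success and fail counts from the example dashboard. The `ModelStats` class, the `flag_underperformers` helper, and the 5-percentage-point gap are illustrative assumptions, not part of Bernstein or of `scripts/quality.sh`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    """Hypothetical per-model record; not a Bernstein data structure."""
    model: str
    success: int
    fail: int

    @property
    def tasks(self) -> int:
        return self.success + self.fail

    @property
    def rate(self) -> float:
        # Success rate as a percentage; 0.0 when no tasks were run.
        return 100.0 * self.success / self.tasks if self.tasks else 0.0


def flag_underperformers(stats, gap_points=5.0):
    """Return models whose rate trails the best model by more than gap_points.

    gap_points is an assumed threshold in percentage points; tune it to taste.
    """
    if not stats:
        return []
    best = max(s.rate for s in stats)
    return [s.model for s in stats if best - s.rate > gap_points]


# Counts copied from the example dashboard above.
stats = [
    ModelStats("claude-sonnet-4", success=22, fail=2),
    ModelStats("gpt-4.1", success=10, fail=2),
]

for s in stats:
    print(f"{s.model}: {s.tasks} tasks, {s.rate:.1f}%")
print("flagged:", flag_underperformers(stats))
```

With the example counts this reproduces the 91.7% and 83.3% rates shown in the dashboard and flags `gpt-4.1`. A fixed gap ignores sample size, so with only a dozen tasks per model a flag is better read as a prompt to look closer than as grounds for a routing change.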

Details

Author: sipyourdrink-ltd
Repository: sipyourdrink-ltd/bernstein
Created: 4 months ago
Last Updated: today
Language: Python
License: Apache-2.0

Integrates with

OpenAI · AI

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

bernstein-cost

Show detailed cost breakdown and budget status for the Bernstein orchestrator. Use when the user asks about spending, budget, cost per model, cost per agent, or wants a cost projection.

744 Updated today

sipyourdrink-ltd

AI & Automation Solid

bernstein-status

Show Bernstein orchestrator status - active agents, task progress, costs, and alerts. Use when the user asks about orchestrator status, what agents are doing, task progress, how much has been spent, or what's happening with the build.

744 Updated today

sipyourdrink-ltd

AI & Automation Solid

bernstein-agents

Manage Bernstein agents - list active agents, inspect their output, kill stalled agents, or stream live logs. Use when the user asks about agents, wants to see what an agent is doing, or needs to kill one.

744 Updated today

sipyourdrink-ltd