cli-eval

Solid

Create and run evaluation suites, watch live benchmark progress, view scorecards, compare model performance, and integrate eval runs with CI workflows from the CLI.

AI & Automation · 6,067 stars · 1,058 forks · Updated today · MIT

Quality Score: 93/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

<!-- generated by src/lib/agentSkills/generator.ts; manual edits will be overwritten -->

## Overview

Create and run evaluation suites, watch live benchmark progress, view scorecards, compare model performance, and integrate eval runs with CI workflows from the CLI.

## Quick install

```bash
npm install -g omniroute   # or: npx omniroute
omniroute --version
```

## Subcommands

### `eval`

**Example:**

```bash
omniroute eval
```

### `eval suites`

**Example:**

```bash
omniroute eval suites
```

### `eval list`

**Example:**

```bash
omniroute eval list
```

### `eval get <suiteId>`

**Example:**

```bash
omniroute eval get <suiteId>
```

### `eval create`

**Flags:**

- `--file <path>`

**Example:**

```bash
omniroute eval create
```

### `eval run <suiteId>`

**Flags:**

- `-m, --model <id>`
- `--combo <name>`
- `--concurrency <n>`
- `--tag <tag>`
- `--watch`

**Example:**

```bash
omniroute eval run <suiteId>
```

### `eval list`

**Flags:**

- `--suite <id>`
- `--status <s>`
- `--since <ts>`
- `--limit <n>`

**Example:**

```bash
omniroute eval list
```

### `eval get <runId>`

**Example:**

```bash
omniroute eval get <runId>
```

### `eval results <runId>`

**Flags:**

- `--failed`

**Example:**

```bash
omniroute eval results <runId>
```

### `eval cancel <runId>`

**Flags:**

- `--yes`

**Example:**

```bash
omniroute eval cancel <runId>
```

### `eval scorecard <runId>`

**Example:**

```bash
omniroute eval scorecard <runId>
```

### `simulate [prompt]`

**Flags:*...
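
The `eval run` flags listed above can be combined in a CI job. The following is a minimal sketch, not part of the OmniRoute documentation: it assembles an `omniroute eval run` command line from environment variables and prints it, so the result can be reviewed before anything is executed. The function name `build_eval_run_cmd` and the variables `EVAL_MODEL`, `EVAL_COMBO`, `EVAL_CONCURRENCY`, `EVAL_TAG`, and `EVAL_WATCH` are hypothetical names chosen for this example; only the flags themselves come from the list above. How the CLI reports failures (exit codes, output format) is not stated here, so the sketch makes no assumption about it.

```bash
#!/usr/bin/env bash
# Sketch: build an `omniroute eval run` command line from environment variables.
# The EVAL_* variable names are assumptions made for this example; the flags
# (--model, --combo, --concurrency, --tag, --watch) are taken from the flag list.

build_eval_run_cmd() {
  local suite_id="${1:-}"
  if [ -z "$suite_id" ]; then
    echo "usage: build_eval_run_cmd <suiteId>" >&2
    return 1
  fi

  # Start with the subcommand and the required suite ID.
  local cmd=(omniroute eval run "$suite_id")

  # Append each optional flag only when its variable is set and non-empty.
  if [ -n "${EVAL_MODEL:-}" ]; then cmd+=(--model "$EVAL_MODEL"); fi
  if [ -n "${EVAL_COMBO:-}" ]; then cmd+=(--combo "$EVAL_COMBO"); fi
  if [ -n "${EVAL_CONCURRENCY:-}" ]; then cmd+=(--concurrency "$EVAL_CONCURRENCY"); fi
  if [ -n "${EVAL_TAG:-}" ]; then cmd+=(--tag "$EVAL_TAG"); fi
  if [ -n "${EVAL_WATCH:-}" ]; then cmd+=(--watch); fi

  # Print the command as a single space-joined line (for display and logging;
  # values containing spaces would need quoting before being re-parsed).
  printf '%s\n' "${cmd[*]}"
}
```

For example, `EVAL_CONCURRENCY=4 EVAL_WATCH=1 build_eval_run_cmd <suiteId>` prints the command with `--concurrency 4 --watch` appended. Printing rather than executing keeps the sketch free of assumptions about the CLI's runtime behavior; a CI step could log the line first and run it afterwards.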

Details

Author: diegosouzapw
Repository: diegosouzapw/OmniRoute
Created: 3 months ago
Last Updated: today
Language: TypeScript
License: MIT
