Galileo-Agent-Labs

Organization

Bring Galileo-powered eval workflows into Claude Code and Codex.

8 indexed · 0 Featured · 33 stars · avg score 79
Prolific

Indexed Skills (8)

AI & Automation · Listed

eval-cost

Use when the user asks to make an AI app cheaper or faster, reduce tokens, latency, or model/tool/retrieval/rerank/self-check/retry/evaluator cost, or compare cost before and after a change.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
Data & Documents · Listed

eval-dataset

Use when the user asks to turn a failure into an eval, create/review/accept/reject dataset cases, or convert Galileo traces, metric gaps, or production examples into cases.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
AI & Automation · Listed

eval-diagnose

Use when Galileo evidence is available and the user asks why a trace, session, log stream, experiment, metric, or AI app behavior failed, regressed, or became unsafe.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
AI & Automation · Listed

eval-engineer

Use when a user is unsure which Eval Engineer command to run for AI agents/RAG apps, needs onboarding/status for a .galileo workspace, or asks where to start.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
AI & Automation · Listed

eval-fetch

Use when a user asks to fetch Galileo evidence, provides Galileo URLs or IDs, says "fetch this Galileo link", or needs traces, sessions, experiments, or log streams saved locally.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
AI & Automation · Listed

eval-setup

Use when the user asks to set up Eval Engineer, check .galileo readiness, create workspace scaffolding, or configure editable files, verification commands, app type, or evidence paths.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
AI & Automation · Listed

eval-audit

Use when the user asks for an AI app audit, launch readiness review, safety/security review, OWASP agentic risk check, metric coverage review, or production RCA gap review.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs
AI & Automation · Listed

eval-measure

Use when the user asks if an AI app is measured correctly, needs Galileo metrics, expected-output contracts, metric profiles, eval gates, or measurement before optimizing.

33 stars · Updated 2 weeks ago
Galileo-Agent-Labs

The bio shown is the top-scored skill's repo description, used as a fallback; real GitHub bios will arrive in a future update.