agentic-eval

Patterns for agent self-improvement through iterative evaluation and refinement: generate, evaluate, critique, and refine loops that move beyond single-shot generation.
fabioc-aloha/Alex_Skill_Mall · ★ 1 · AI & Automation · score 80
Install: claude install-skill fabioc-aloha/Alex_Skill_Mall
# Agentic Evaluation Patterns

Patterns for self-improvement through iterative evaluation and refinement.

## Overview

Evaluation patterns enable agents to assess and improve their own outputs, moving beyond single-shot generation to iterative refinement loops.

```
Generate → Evaluate → Critique → Refine → Output
    ↑                              │
    └──────────────────────────────┘
```

## When to Use

- **Quality-critical generation**: Code, reports, analysis requiring high accuracy
- **Tasks with clear evaluation criteria**: Defined success metrics exist
- **Content requiring specific standards**: Style guides, compliance, formatting

---

## Pattern 1: Basic Reflection

The agent evaluates and improves its own output through self-critique.

```python
import json

# `llm` is assumed to be a user-supplied callable: prompt string in, completion text out.

def reflect_and_refine(task: str, criteria: list[str], max_iterations: int = 3) -> str:
    """Generate with reflection loop."""
    output = llm(f"Complete this task:\n{task}")

    for i in range(max_iterations):
        # Self-critique: ask the model to grade its own output against each criterion
        critique = llm(f"""
        Evaluate this output against criteria: {criteria}
        Output: {output}
        Rate each: PASS/FAIL with feedback as JSON.
        """)

        # Expects a JSON object mapping each criterion to {"status": ..., "feedback": ...}
        critique_data = json.loads(critique)
        all_pass = all(c["status"] == "PASS" for c in critique_data.values())
        if all_pass:
            return output

        # Refine based on critique
        failed = {k: v["feedback"] for k, v in critique_data.items() if v["status"] == "FAIL