multi-model-routing

Use when wiring multiple LLM providers / models into one application and you want to pick the cheapest model per task type without hand-coding the routing. Generates a router with per-task-type model selection, per-call cost telemetry, and a dashboard-ready emit. Triggers on: 'route between Claude and GPT', 'cheapest model for X', 'multi-model dispatch', 'LLM cost telemetry', 'which model should I use for vocab extraction', 'reduce LLM bill'.
mickolasjae/mick-applied-ai-toolkit · ★ 0 · AI & Automation · score 70
Install: claude install-skill mickolasjae/mick-applied-ai-toolkit
# Multi-Model Routing

Scaffold a cost-aware LLM router that picks the cheapest model that clears each task type's quality bar, then logs per-call cost so you can keep optimizing.

## 1. When to use this skill

Trigger any time the application has, or will soon have, more than one model in play and you want to stop hand-coding which one gets called where.

Phrases that should fire this skill:

- "route between Claude and GPT"
- "cheapest model for [task]"
- "multi-model dispatch"
- "LLM cost telemetry"
- "which model should I use for vocab extraction / classification / RAG / X"
- "reduce our LLM bill"
- "switch this call to a cheaper model"

This is the LLM analog of the "right tool for the job" rule. Do not use Opus to classify a string. Do not use Haiku to plan a 12-step agent run. The router enforces that discipline at the call site.

## 2. The task-taxonomy approach

The verified pattern (from Mercor's scoring module, README §5/9/10) is:

> "We choose the model based on the type of rubric item — not all items need the same LLM."

Concretely, they route:

- **Forms (text-only)** → `o4-mini`: cheap, fast, text-only
- **Interviews (video + audio + transcript)** → `gemini-2.5-flash`: multimodal capable

That's the whole idea, generalized:

1. **Enumerate the task types** in your application. Not endpoints, not services: *task types*. e.g. "classify intent", "extract vocab pairs", "synthesize RAG answer", "plan multi-step agent action", "transcribe voice memo", "reason over