experiment-designerlisted
Install: claude install-skill mistakenot/auto-stack
# Experiment Designer
Design and run a formal experiment with stated hypothesis, explicit pass/fail thresholds, and findings that get checked into git as durable artifacts. Optimized for the case where a research question is too important to settle by intuition and the answer is worth keeping.
## When to use vs. neighbors
- **experiment-designer** (this skill): formal research question with hypothesis + thresholds + worker dispatch + findings committed to `docs/experiments/`. Multi-phase OK.
- **tech-spike**: quick informal exploration, scratch work in `.tmp/`, no formal writeup.
- **new-task / new-solution / new-plan**: planning to build a known feature, not testing a research question.
Trigger experiment-designer when: the user states a question with multiple plausible answers, the cost of guessing wrong is high enough that a $1-30 OpenAI bill is a good trade, and the answer would inform a real design decision.
## The full lifecycle
Four phases. Each gates the next.
### Phase 1: Design
1. **Capture the hypothesis in one sentence.** "X works for Y" or "method A beats method B on metric M." If you cannot state it crisply, the experiment isn't ready yet — go back to the user with clarifying questions.
2. **Pick the metric(s) and the pass/fail threshold for each.** Vague success criteria produce vague reports. For each metric: write the number that would make you conclude "yes" and the number that would make you conclude "no." Anything in between is "partial" and needs