← ClaudeAtlas

ab-test-designlisted

Designs A/B tests with power analysis, minimum detectable effect (MDE), sample size estimation, randomization unit selection, guardrail definition, and a pre-registration checklist. Use when the user mentions experiment design, A/B test setup, power analysis, sample size, MDE, pre-registration, randomization, or asks "how should I run this experiment."
vermapragya/analytics-skill · ★ 0 · Web & Frontend · score 72
Install: claude install-skill vermapragya/analytics-skill
# A/B Test Design ## When to use this skill Use when the user is **planning** an experiment, not analyzing one. Triggers include: - "Design an A/B test for…" - "What sample size do I need…" - "How long should I run this test…" - "Pre-register this experiment" - "Pick guardrails for…" If the user already has results, use `ab-test-analysis` instead. ## Required inputs Collect these before computing anything. If missing, ask. | Input | Why it matters | |---|---| | Primary metric | Determines test type (proportion, mean, ratio) | | Baseline rate or mean | Required for power calculation | | Minimum detectable effect (MDE) | Sets sensitivity floor | | Randomization unit | User, session, account, device | | Expected daily exposure (units/day) | Determines runtime | | Variant count (control + N treatments) | Affects multiple-comparison correction | | Guardrail metrics | What must not regress | ## Workflow 1. **Confirm hypothesis is testable.** A hypothesis has the form: "Changing X will move metric Y by at least Z%, because reason R." If reason R is missing, push back. 2. **Pick the metric type.** - Binary outcome (conversion, click) -> proportion test - Continuous (revenue per user, session length) -> mean test, log-transform if skewed - Ratio (revenue per impression) -> delta method or bootstrap 3. **Set MDE conservatively.** Default to the smallest effect the team would actually act on. Do not optimize MDE to fit the runtime — that's how teams ship noise. 4. *