
ab-test-analysis

Analyzes A/B test results with significance testing, confidence intervals, sample ratio mismatch check, guardrail evaluation, and a stakeholder-ready readout. Use when the user mentions A/B test results, experiment readout, test analysis, lift, significance, p-value, treatment vs control, or asks "did the experiment work."
vermapragya/analytics-skill · ★ 0 · Testing & QA · score 72
Install: claude install-skill vermapragya/analytics-skill
# A/B Test Analysis

## When to use this skill

The experiment has **finished** (or has reached planned sample size) and the user needs to interpret the results. Triggers:

- "Analyze this experiment…"
- "Did the test win?"
- "Was the lift significant?"
- "Write a readout for experiment X"
- "Compare treatment vs control on…"

If the experiment is still being planned, use `ab-test-design`.

## Required inputs

| Input | Format |
|---|---|
| Per-unit assignment data | `unit_id, variant, metric_value` (or aggregate) |
| Variant labels | Which is control |
| Primary metric definition | From the pre-registration |
| Guardrail metrics | From the pre-registration |
| Test design | Sample size targets, MDE, allocation |

If pre-registration is missing, **flag it loudly** in the readout. Post-hoc analysis without a pre-reg should be labeled exploratory.

## Workflow

1. **Sanity check the data.**
   - Verify variant labels match the design
   - Confirm there's exactly one record per unit per variant
   - Check date range matches the experiment window
   - Strip any users who appeared in multiple variants (assignment errors)
2. **Run Sample Ratio Mismatch (SRM) check.**
   - Compute observed vs expected ratio
   - χ² test against design allocation
   - If p < 0.001, **stop**. SRM means broken assignment, so the results are invalid.
3. **Compute primary metric per variant.**
   - Point estimate
   - 95% confidence interval (use bootstrap for ratio metrics)
   - Absolute lift and relative l
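
The SRM check in step 2 of the workflow can be sketched roughly as below. This is a minimal illustration, not the skill's own implementation. It assumes exactly two variants (control and treatment), so the χ² statistic has one degree of freedom and its p-value can be computed with the standard library alone. The function name `srm_check`, its parameters, and the counts in the usage lines are hypothetical.

```python
import math


def srm_check(n_control, n_treatment, expected_control_share=0.5, alpha=0.001):
    """Two-variant Sample Ratio Mismatch check (hypothetical helper).

    Runs a chi-square goodness-of-fit test of the observed unit counts
    against the design allocation. Assumes exactly two variants, i.e.
    one degree of freedom.
    """
    if not 0 < expected_control_share < 1:
        raise ValueError("expected_control_share must be strictly between 0 and 1")

    total = n_control + n_treatment
    if total == 0:
        raise ValueError("no units were assigned to either variant")

    # Expected counts under the design allocation.
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    # Chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )

    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "observed_control_share": n_control / total,
        "expected_control_share": expected_control_share,
        "chi2": chi2,
        "p_value": p_value,
        # The workflow's stop rule: p < 0.001 means assignment is broken.
        "srm_detected": p_value < alpha,
    }


# Illustrative counts only (made up for this sketch).
balanced = srm_check(5000, 5100)
skewed = srm_check(5000, 5400)
print(balanced["srm_detected"], skewed["srm_detected"])
```

For experiments with more than two variants, the same statistic applies with more degrees of freedom; in that case a library routine such as `scipy.stats.chisquare` would be the more natural choice than the one-degree-of-freedom shortcut used here.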