ab-test-analysis


Analyze A/B test results with statistical significance, sample size validation, confidence intervals, and ship/extend/stop recommendations. Use when evaluating experiment results, checking if a test reached significance, interpreting split test data, or deciding whether to ship a variant.

Testing & QA | 9,767 stars | 1,066 forks | Updated 1 month ago | MIT


Quality Score: 88/100

Stars (weight 20%): 100
Recency (weight 20%): 75
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

## A/B Test Analysis

Evaluate A/B test results with statistical rigor and translate findings into clear product decisions.

### Context

You are analyzing A/B test results for **$ARGUMENTS**. If the user provides data files (CSV, Excel, or analytics exports), read and analyze them directly. Generate Python scripts for statistical calculations when needed.

### Instructions

1. **Understand the experiment**:
   - What was the hypothesis?
   - What was changed (the variant)?
   - What is the primary metric? Are there any guardrail metrics?
   - How long did the test run?
   - What is the traffic split?

2. **Validate the test setup**:
   - **Sample size**: Is the sample large enough for the expected effect size?
     - Use the formula (per group): n = ((Z_{α/2} + Z_β)² × 2 × p × (1 - p)) / MDE², where p is the baseline conversion rate, MDE is the minimum detectable effect expressed as an absolute difference, and Z_β is the power term (about 0.84 for 80% power)
     - Flag if the test is underpowered (<80% power)
   - **Duration**: Did the test run for at least 1-2 full business cycles?
   - **Randomization**: Is there any evidence of sample ratio mismatch (SRM)?
   - **Novelty/primacy effects**: Was there enough time to wash out initial behavior changes?

3. **Calculate statistical significance**:
   - **Conversion rate** for control and variant
   - **Relative lift**: (variant - control) / control × 100
   - **p-value**: Using a two-tailed z-test or chi-squared test
   - **Confidence interval**: 95% CI for the difference
   - **Statistical significance**: Is p < 0.05?
   - **Practical significance**: Is the lift meaningful for the business?

If the user provides raw data, generate...
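The validation and significance steps above can be sketched in Python using only the standard library. This is a minimal illustration, not the skill's own script: the function names, the example counts, and the default 50/50 split are all hypothetical. The sample size function uses the standard two-proportion formula with an explicit power term (Z_β), which is what ties the calculation to the 80% power threshold. The SRM check is a one-degree-of-freedom chi-squared goodness-of-fit test, and the confidence interval uses the unpooled standard error.

```python
import math
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def required_sample_size(p, mde, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test on two proportions.

    p is the baseline conversion rate; mde is an ABSOLUTE difference
    (e.g. 0.01 for one percentage point).
    """
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_beta = _norm.inv_cdf(power)
    return math.ceil((z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / mde ** 2)


def srm_p_value(n_control, n_variant, expected_control_share=0.5):
    """Chi-squared goodness-of-fit p-value for sample ratio mismatch.

    With two groups the statistic has 1 degree of freedom, so its tail
    probability can be computed from the standard normal distribution.
    """
    total = n_control + n_variant
    exp_control = total * expected_control_share
    exp_variant = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_variant - exp_variant) ** 2 / exp_variant)
    return 2 * (1 - _norm.cdf(math.sqrt(chi2)))


def two_proportion_z_test(conv_c, n_c, conv_v, n_v, alpha=0.05):
    """Two-tailed z-test plus a (1 - alpha) CI for the absolute difference."""
    p_c, p_v = conv_c / n_c, conv_v / n_v
    diff = p_v - p_c

    # The pooled standard error is used for the hypothesis test.
    pooled = (conv_c + conv_v) / (n_c + n_v)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_v))
    z = diff / se_pooled
    p_value = 2 * (1 - _norm.cdf(abs(z)))

    # The unpooled standard error is used for the confidence interval.
    se_unpooled = math.sqrt(p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v)
    margin = _norm.inv_cdf(1 - alpha / 2) * se_unpooled

    return {
        "control_rate": p_c,
        "variant_rate": p_v,
        "relative_lift_pct": diff / p_c * 100,
        "z": z,
        "p_value": p_value,
        "ci": (diff - margin, diff + margin),
        "significant": p_value < alpha,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print("needed per group:", required_sample_size(p=0.05, mde=0.01))
    print("SRM p-value:", srm_p_value(10_000, 10_000))
    print(two_proportion_z_test(500, 10_000, 570, 10_000))
```

A very small SRM p-value suggests the observed split deviates from the planned one, in which case the significance results should be treated with suspicion. The cut-off for that flag is a judgment call; a strict threshold such as 0.001 is a common choice and is an assumption here, not something the skill specifies. Statistical significance from `two_proportion_z_test` still needs to be weighed against practical significance before making a ship/extend/stop recommendation.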

Details

Author: phuryn
Repository: phuryn/pm-skills
Created: 1 month ago
Last Updated: 1 month ago
Language: N/A
License: MIT
