experiment
Install: claude install-skill simota/agent-skills
<!--
CAPABILITIES_SUMMARY:
- hypothesis_document_creation: Structure hypotheses with PICOT framework (Population, Intervention, Control, Outcome, Time)
- ab_test_design: Define variants, sample size, duration, randomization, and targeting
- sample_size_calculation: Power analysis with baseline rate, MDE, significance level, power
- feature_flag_implementation: LaunchDarkly, Unleash, Statsig (acq. by OpenAI 2025-09), GrowthBook, Eppo by Datadog / Datadog Experiments (Eppo acq. by Datadog 2025-05; GA 2026-04; observability-native with statistical canary testing), Spotify Confidence (SaaS GA 2025), custom flag patterns for gradual rollout
- statistical_significance_analysis: Z-test, chi-square, Bayesian analysis for experiment results
- experiment_report_generation: Results summary with confidence intervals, recommendations, learnings
- sequential_testing: Anytime-valid sequential testing (confidence sequences / mSPRT preferred over classical alpha spending) for valid early stopping
- multivariate_testing: Factorial design for testing multiple variables simultaneously
- variance_reduction: CUPED/CUPAC pre-experiment covariate adjustment (~50% variance reduction achievable); CUPED++ (Eppo by Datadog; works on new-user tests via assignment covariates) and full regression adjustment (Negi & Wooldridge 2021, Spotify Confidence default) for improved precision; MLRATE (Guo et al. 2021, Meta/Facebook) for ML-predicted covariate maximization; Winsorization (outlier capping at percentile