bayesian-experiment-reader

Bayesian counterpart to experiment-result-reader. Computes posterior P(variant beats control), credible intervals, and expected loss from per-variant exposure and conversion data. Beta-Binomial for proportion metrics (CVR), Normal-Normal for continuous metrics (revenue per user). Decision rule combines a confidence threshold with an expected-loss tolerance, so the ship decision reflects both "how likely is this better?" and "how bad is it if I'm wrong?". Use this skill alongside experiment-result-reader when reading any A/B test result. Pairs with analytics-diagnostic-method. Use whenever interpreting an A/B test result the user plans to ship from, when the question is "what's the chance variant wins?", or when a frequentist p-value is on the edge and the user wants the posterior view. Triggers when Clamp MCP returns experiment exposure and conversion data, or when any analytics source surfaces per-variant counts.
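
As an illustration of the proportion-metric path described above, here is a minimal sketch of a Beta-Binomial read using only the Python standard library. It is not the skill's actual implementation: the function name `beta_binomial_read`, the uniform Beta(1, 1) prior, the Monte Carlo approach, and the default thresholds (95% probability, 0.1 percentage point expected loss) are all assumptions chosen for the example. It only separates "ship" from "not ship"; the hold/kill split is left out.

```python
import random


def beta_binomial_read(control, variant, draws=20000, seed=0,
                       prior_alpha=1.0, prior_beta=1.0,
                       prob_threshold=0.95, loss_tolerance=0.001):
    """Hypothetical helper. control and variant are (conversions, exposures)."""
    rng = random.Random(seed)

    def posterior_draws(conversions, exposures):
        # Beta prior + binomial data gives a Beta posterior (conjugacy).
        a = prior_alpha + conversions
        b = prior_beta + (exposures - conversions)
        return [rng.betavariate(a, b) for _ in range(draws)]

    c = posterior_draws(*control)
    v = posterior_draws(*variant)

    # Posterior draws of the absolute lift (variant rate minus control rate).
    lifts = sorted(vi - ci for vi, ci in zip(v, c))

    # "How likely is this better?"
    p_win = sum(d > 0 for d in lifts) / draws
    # "How bad is it if I'm wrong?": average shortfall if we ship the variant.
    loss_if_ship = sum(max(-d, 0.0) for d in lifts) / draws
    # 95% credible interval for the lift, from the sample quantiles.
    lo = lifts[int(0.025 * draws)]
    hi = lifts[int(0.975 * draws) - 1]

    # Assumed decision rule: both conditions must hold to ship.
    ship = p_win >= prob_threshold and loss_if_ship <= loss_tolerance
    return {
        "p_variant_wins": p_win,
        "expected_loss_if_ship": loss_if_ship,
        "lift_interval_95": (lo, hi),
        "ship": ship,
    }


if __name__ == "__main__":
    # Illustrative counts only: 100/1000 control vs 150/1000 variant.
    print(beta_binomial_read((100, 1000), (150, 1000)))
```

Requiring both conditions means a variant that is probably better but carries a large downside when wrong does not pass, which is the point of pairing the confidence threshold with a loss tolerance.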
clamp-sh/analytics-skills · ★ 6 · AI & Automation · score 81
Install: claude install-skill clamp-sh/analytics-skills
# Bayesian experiment reader

A frequentist p-value answers a question stakeholders don't ask: "if the variants were identical, how surprising would this data be?" What they actually want is "what's the chance the variant is better?" and "if I ship it and I'm wrong, how bad is it?" Bayesian inference answers both directly. This skill encodes that math and the decision rule it enables.

It pairs with `experiment-result-reader`. Run that one first for the frequentist read and the setup checks (SRM, mix shift, peeking). Run this one to translate the same per-variant counts into a posterior probability and a ship/hold/kill decision.

## When NOT to use this

- The setup isn't clean. SRM, exposure-event gaps, or mix shift contaminate Bayesian math just as badly as frequentist math. Fix the setup first via `experiment-result-reader`'s Phase 1 and Phase 4.
- The conversion metric is heavily right-skewed and you only have a handful of conversions per variant (e.g. revenue per user with three whales). The Normal-Normal model assumes approximately normal sampling distributions; small-sample skew breaks it. Either log-transform, bucket into a proportion, or wait for more data.
- The user wants to *design* a new experiment. Sample-size planning under a Bayesian framework is a different problem (expected loss under prior + planned n). This skill reads results, it doesn't plan them.
- The user wants a single number to defend a decision in a hostile review. Bayesian outputs are inherently p