senior-data-scientist

Solid

World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking — using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.

AI & Automation 17,886 stars 2466 forks Updated today MIT

Install

View on GitHub

Quality Score: 93/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Senior Data Scientist World-class senior data scientist skill for production-grade AI/ML/Data systems. ## Core Workflows ### 1. Design an A/B Test ```python import numpy as np from scipy import stats def calculate_sample_size(baseline_rate, mde, alpha=0.05, power=0.8): """ Calculate required sample size per variant. baseline_rate: current conversion rate (e.g. 0.10) mde: minimum detectable effect (relative, e.g. 0.05 = 5% lift) """ p1 = baseline_rate p2 = baseline_rate * (1 + mde) effect_size = abs(p2 - p1) / np.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2) z_alpha = stats.norm.ppf(1 - alpha / 2) z_beta = stats.norm.ppf(power) n = ((z_alpha + z_beta) / effect_size) ** 2 return int(np.ceil(n)) def analyze_experiment(control, treatment, alpha=0.05): """ Run two-proportion z-test and return structured results. control/treatment: dicts with 'conversions' and 'visitors'. """ p_c = control["conversions"] / control["visitors"] p_t = treatment["conversions"] / treatment["visitors"] pooled = (control["conversions"] + treatment["conversions"]) / (control["visitors"] + treatment["visitors"]) se = np.sqrt(pooled * (1 - pooled) * (1 / control["visitors"] + 1 / treatment["visitors"])) z = (p_t - p_c) / se p_value = 2 * (1 - stats.norm.cdf(abs(z))) ci_low = (p_t - p_c) - stats.norm.ppf(1 - alpha / 2) * se ci_high = (p_t - p_c) + stats.norm.ppf(1 - alpha / 2) * se return { "lift": ...

Details

Author
alirezarezvani
Repository
alirezarezvani/claude-skills
Created
7 months ago
Last Updated
today
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

senior-data-scientist

World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking — using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.

2 Updated 1 weeks ago
mdnaimul22
AI & Automation Solid

senior-data-scientist

World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.

2,279 Updated 3 weeks ago
foryourhealth111-pixel
AI & Automation Solid

statistical-analyst

Run hypothesis tests, analyze A/B experiment results, calculate sample sizes, and interpret statistical significance with effect sizes. Use when you need to validate whether observed differences are real, size an experiment correctly before launch, or interpret test results with confidence.

17,886 Updated today
alirezarezvani