senior-data-scientist

Solid

World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking — using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.

AI & Automation · 17,886 stars · 2,466 forks · Updated today · MIT

Quality Score: 93/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%)
Documentation (15%): 100
Issue Health (10%)
License (10%): 100
Description (5%): 100

Skill Content

# Senior Data Scientist

World-class senior data scientist skill for production-grade AI/ML/Data systems.

## Core Workflows

### 1. Design an A/B Test

```python
import numpy as np
from scipy import stats


def calculate_sample_size(baseline_rate, mde, alpha=0.05, power=0.8):
    """
    Calculate required sample size per variant.

    baseline_rate: current conversion rate (e.g. 0.10)
    mde: minimum detectable effect (relative, e.g. 0.05 = 5% lift)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + mde)
    # Standardized difference using the average of the two binomial variances.
    effect_size = abs(p2 - p1) / np.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    z_alpha = stats.norm.ppf(1 - alpha / 2)  # two-sided critical value
    z_beta = stats.norm.ppf(power)
    # Per-group n for a two-sample comparison is 2 * ((z_a + z_b) / d)^2;
    # without the factor of 2 the test would be underpowered.
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return int(np.ceil(n))


def analyze_experiment(control, treatment, alpha=0.05):
    """
    Run two-proportion z-test and return structured results.

    control/treatment: dicts with 'conversions' and 'visitors'.
    """
    p_c = control["conversions"] / control["visitors"]
    p_t = treatment["conversions"] / treatment["visitors"]
    # Pooled rate under the null hypothesis of equal conversion rates.
    pooled = (control["conversions"] + treatment["conversions"]) / (
        control["visitors"] + treatment["visitors"]
    )
    se = np.sqrt(
        pooled * (1 - pooled) * (1 / control["visitors"] + 1 / treatment["visitors"])
    )
    z = (p_t - p_c) / se
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # two-sided
    ci_low = (p_t - p_c) - stats.norm.ppf(1 - alpha / 2) * se
    ci_high = (p_t - p_c) + stats.norm.ppf(1 - alpha / 2) * se
    return {
        "lift": ...
```
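The description lists Bonferroni correction alongside the two-proportion z-test, but the preview is cut off before that part of the skill. The following is a minimal sketch of how the two could fit together when several variants are compared against one control. It is not the skill's own code: the function names `two_proportion_p_value` and `bonferroni_decisions`, the dict layout, and the visitor counts are all hypothetical. It uses only the Python standard library (`statistics.NormalDist`) instead of SciPy so that it runs on its own.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_c, n_c, conv_t, n_t):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni_decisions(control, variants, alpha=0.05):
    """
    Test each variant against control at alpha / m, where m is the
    number of comparisons (hypothetical helper, illustrative only).
    """
    if not variants:
        raise ValueError("at least one variant is required")
    adjusted_alpha = alpha / len(variants)  # Bonferroni-adjusted threshold
    results = {}
    for name, variant in variants.items():
        p = two_proportion_p_value(
            control["conversions"], control["visitors"],
            variant["conversions"], variant["visitors"],
        )
        results[name] = {
            "p_value": p,
            "adjusted_alpha": adjusted_alpha,
            "significant": p < adjusted_alpha,
        }
    return results


# Made-up counts for illustration; not taken from the skill.
control = {"conversions": 1000, "visitors": 10000}
variants = {
    "B": {"conversions": 1150, "visitors": 10000},
    "C": {"conversions": 1010, "visitors": 10000},
}
decisions = bonferroni_decisions(control, variants)
for name, result in decisions.items():
    print(name, result)
```

Dividing `alpha` by the number of comparisons keeps the family-wise error rate at or below `alpha`, at the cost of some power. With many variants a less conservative procedure such as Holm's may be preferable.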

Details

Author: alirezarezvani
Repository: alirezarezvani/claude-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

OpenAI · AI
Anthropic · AI

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

senior-data-scientist

2 Updated 1 week ago

mdnaimul22

AI & Automation Solid

senior-data-scientist

World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.

2,279 Updated 3 weeks ago

foryourhealth111-pixel

AI & Automation Solid

statistical-analyst

Run hypothesis tests, analyze A/B experiment results, calculate sample sizes, and interpret statistical significance with effect sizes. Use when you need to validate whether observed differences are real, size an experiment correctly before launch, or interpret test results with confidence.

17,886 Updated today

alirezarezvani