logistic-regressionlisted
Install: claude install-skill vermapragya/analytics-skill
# Logistic Regression
## When to use this skill
Use for **binary outcome prediction** where interpretability matters as much as accuracy. Triggers:
- "Model churn / conversion / fraud / adoption"
- "Predict probability of X"
- "Build a propensity model"
- "What predicts <binary outcome>?"
- "Logistic regression for…"
For complex non-linear interactions, suggest a tree-based model after fitting logistic as the baseline. For continuous outcomes, use `linear-regression`. For time-to-event, use `survival-analysis`.
## Required inputs
| Input | Why it matters |
|---|---|
| Binary target | What you're predicting (must be 0/1 or boolean) |
| Feature set | Predictors, with clear definitions |
| Observation grain | Per user / per session / per opportunity |
| Time cutoffs | Feature window must end BEFORE outcome window starts (leakage check) |
| Train/test strategy | Temporal split for production models, random split for exploratory |
## Workflow
1. **Audit the data first** (use `data-quality-audit` skill).
2. **Check class balance.**
```python
print(y.value_counts(normalize=True))
```
- Balanced (40-60%): standard logistic
- Mild imbalance (10-40%): use `class_weight='balanced'` and look at PR-AUC, not ROC-AUC alone
- Severe (< 5%): consider downsampling, but always evaluate calibration on the original distribution
3. **Verify no leakage.** The most common DS bug. Ask:
- Is any feature derived from data *after* the prediction time?
- Is the target ev