Install: claude install-skill choxos/BiostatAgent
# Recipes Feature Engineering Patterns
## Overview
Patterns for feature engineering with the recipes package, covering preprocessing steps for numeric, categorical, and text data. Each step is estimated from the training data only and then applied to new data, which prevents information leakage.
## Recipe Fundamentals
### Basic Recipe Structure
```r
library(recipes)

# Initialize recipe with formula
rec <- recipe(outcome ~ ., data = training_data)

# Or with explicit roles
rec <- recipe(training_data) |>
  update_role(outcome, new_role = "outcome") |>
  update_role(id_column, new_role = "ID") |>
  update_role(-outcome, -id_column, new_role = "predictor")
```
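
A recipe only declares the steps; nothing is estimated until it is prepped. The sketch below illustrates the usual `prep()` / `bake()` workflow that keeps test data out of the estimation. The split of the built-in `mtcars` data and the names `train_idx`, `training_data`, and `testing_data` are assumptions made for illustration, not part of the original patterns.

```r
library(recipes)

# Hypothetical train/test split of the built-in mtcars data (illustration only)
train_idx     <- seq_len(24)
training_data <- mtcars[train_idx, ]
testing_data  <- mtcars[-train_idx, ]

rec <- recipe(mpg ~ ., data = training_data) |>
  step_normalize(all_numeric_predictors())

# prep() estimates the means and standard deviations from the training data only
rec_prepped <- prep(rec, training = training_data)

# bake() applies the training-set statistics; new_data = NULL returns the
# processed training set, any other data frame is transformed the same way
train_baked <- bake(rec_prepped, new_data = NULL)
test_baked  <- bake(rec_prepped, new_data = testing_data)
```

Because the test rows never enter `prep()`, their statistics cannot influence the scaling parameters.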
### Selector Functions
```r
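# Type-based selectors
all_predictors()            # every column with the "predictor" role
all_outcomes()              # every column with the "outcome" role
all_numeric_predictors()    # numeric predictors only
all_nominal_predictors()    # factor or character predictors only
all_numeric()               # any numeric column, outcomes included
all_nominal()               # any factor or character column, outcomes included

# Name-based selectors (from tidyselect)
starts_with("prefix_")
ends_with("_suffix")
contains("pattern")
matches("regex")
all_of(c("var1", "var2"))   # replaces the superseded one_of()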
```
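
Selectors are only meaningful inside a step. As a minimal sketch, assuming the built-in `iris` data as a stand-in for real training data, the example below combines a name-based selector with a type-based selector and a negated column to exclude one variable.

```r
library(recipes)

rec <- recipe(Species ~ ., data = iris) |>
  # Name-based selector: log-transform only the Sepal.* columns
  step_log(starts_with("Sepal")) |>
  # Type-based selector with an exclusion: normalize every numeric
  # predictor except Petal.Width
  step_normalize(all_numeric_predictors(), -Petal.Width)

baked <- bake(prep(rec), new_data = NULL)
```

Selectors are resolved when the recipe is prepped, so the same recipe definition adapts to whatever columns match at that point.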
## Numeric Preprocessing
### Normalization and Scaling
```r
# The four steps are alternatives, listed together for reference;
# in practice, pick the one that fits the model
rec <- recipe(outcome ~ ., data = train) |>
  # Center and scale (z-score)
  step_normalize(all_numeric_predictors()) |>
  # Scale to [0, 1]
  step_range(all_numeric_predictors(), min = 0, max = 1) |>
  # Center only
  step_center(all_numeric_predictors()) |>
  # Scale only
  step_scale(all_numeric_predictors())
```
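
To see how the scaling options differ, it can help to fit them as separate recipes and compare the results. This is a sketch assuming the built-in `mtcars` data; the names `rec_z`, `rec_01`, `z_scaled`, and `unit_scaled` are illustrative.

```r
library(recipes)

# z-score scaling: each predictor gets mean 0 and standard deviation 1
rec_z <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors())

# Range scaling: each predictor is mapped onto [0, 1]
rec_01 <- recipe(mpg ~ ., data = mtcars) |>
  step_range(all_numeric_predictors(), min = 0, max = 1)

z_scaled    <- bake(prep(rec_z), new_data = NULL)
unit_scaled <- bake(prep(rec_01), new_data = NULL)

# Inspect one predictor under each scheme
summary(z_scaled$hp)
range(unit_scaled$hp)
```

The outcome `mpg` is left untouched in both cases because `all_numeric_predictors()` selects predictors only.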
### Transformations for Normality
```r
rec <- recipe(outcome ~ ., data = train) |>
  # Yeo-Johnson (handles zero and negative values)
  step_YeoJohnson(all_numeric_predictors()) |>
  # Box-Cox (positive values