# run-pipeline
Install: claude install-skill morganmuli/metaskill
You are executing the full data science pipeline for this project. Run each stage sequentially, verifying success before proceeding to the next stage. Stop immediately if any stage fails and report the error clearly.
## Dynamic Context
Current branch: !`git branch --show-current`
Data directory contents: !`ls data/ 2>/dev/null || echo "No data/ directory found"`
Available configs: !`ls configs/*.yaml 2>/dev/null || ls configs/*.toml 2>/dev/null || echo "No config files found"`
Python environment: !`which python3 && python3 --version 2>/dev/null || echo "Python not found"`
Recent changes: !`git diff --stat HEAD~3 2>/dev/null || echo "No recent commits"`
## Configuration
If the user provided a config file as an argument, use it: `$ARGUMENTS`
Otherwise, look for the default config at `configs/experiment.yaml` or `configs/experiment.toml`.
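The lookup order above can be sketched in Python as follows. This is a minimal illustration, not part of the project: the helper name `resolve_config` is hypothetical, preferring YAML over TOML simply follows the order in which the defaults are listed, and treating an explicitly given but missing file as an error (rather than falling back to a default) is an assumption.

```python
from pathlib import Path
from typing import Optional

# Default locations named in the Configuration section, in lookup order.
DEFAULT_CONFIGS = ("configs/experiment.yaml", "configs/experiment.toml")


def resolve_config(argument: str, root: Path = Path(".")) -> Optional[Path]:
    """Return the config path to use, or None if nothing suitable exists.

    `argument` stands in for the value of $ARGUMENTS (hypothetical helper).
    """
    argument = argument.strip()
    if argument:
        # Assumption: an explicit argument is never silently replaced
        # by a default, so a missing file yields None.
        candidate = root / argument
        return candidate if candidate.is_file() else None
    for default in DEFAULT_CONFIGS:
        candidate = root / default
        if candidate.is_file():
            return candidate
    return None
```

A caller would report "No config files found" when the function returns `None`, matching the fallback message used in the Dynamic Context section.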
## Pipeline Stages
Execute each stage in order. After each stage, check for errors and verify outputs exist before proceeding.
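For illustration, the run-check-proceed loop described above might look like the following sketch. The function name `run_stages` and the tuple layout for a stage are hypothetical; the sketch assumes each stage is a single command whose success is signalled by exit code 0 and by the existence of its expected output paths.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence, Tuple

# A stage is (name, command, output paths that must exist afterwards).
# This layout is an assumption made for the sketch.
Stage = Tuple[str, Sequence[str], Sequence[str]]


def run_stages(stages: List[Stage]) -> Tuple[bool, str]:
    """Run stages in order; stop at the first failure and describe it."""
    for name, command, expected_outputs in stages:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # A non-zero exit code stops the pipeline immediately.
            detail = result.stderr.strip()
            return False, f"{name} failed (exit {result.returncode}): {detail}"
        missing = [p for p in expected_outputs if not Path(p).exists()]
        if missing:
            # The command succeeded but did not produce what was expected.
            return False, f"{name} succeeded but outputs are missing: {', '.join(missing)}"
    return True, "all stages completed"
```

Returning a message instead of raising keeps the error report in one place, so the failing stage and its stderr can be shown to the user verbatim.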
### Stage 1: Environment Check
Verify the Python environment is ready:
```bash
python3 -c "import torch; import pandas; import numpy; print(f'PyTorch {torch.__version__}, pandas {pandas.__version__}, NumPy {numpy.__version__}')"
```
If imports fail, report which packages are missing and suggest `pip install -r requirements.txt`.
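The one-line import check stops at the first package that fails to import, so it cannot list every missing package in one pass. One way to collect them all is sketched here; the helper names `missing_packages` and `report` are hypothetical, and the package list is taken from the command above. Note that `importlib.util.find_spec` only checks that a module can be located, so a broken installation could still fail at import time.

```python
import importlib.util
from typing import Iterable, List

# Top-level modules checked by the Stage 1 command.
REQUIRED = ("torch", "pandas", "numpy")


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the top-level modules from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(missing: List[str]) -> str:
    """Format the message to show the user for a list of missing modules."""
    if not missing:
        return "All required packages are available."
    return (
        "Missing packages: " + ", ".join(missing)
        + ". Try: pip install -r requirements.txt"
    )
```

Calling `print(report(missing_packages()))` produces either the success line or a single message naming every absent package.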
### Stage 2: Data Validation
Run data validation on the raw data:
```bash
python3 -m src.data.validate --data-dir data/raw/
```
If the validation script does not exist, look for al