Install: claude install-skill barobaonguyen/ai-automation-skills
# Backtest Comparator
Use this skill when several strategy variants or parameter sets each have per-year (or per-fold) results and you need to pick the robust one, not the one with the highest average. A variant can win on the mean while quietly losing on an individual fold; this comparator ranks by mean but also reports standard deviation and the worst fold, then flags overfit and unstable variants so they can't slip through.
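The ranking and flagging described above can be sketched in a few lines. This is a minimal illustration, not the contents of `assets/compare.py`: the function names, the population standard deviation, and the `unstable_ratio` threshold are all assumptions made for the example, and the fold values are invented. Only the overfit rule (positive mean, at least one losing fold) comes from the description here.

```python
from statistics import mean, pstdev


def summarize(folds, unstable_ratio=1.0):
    """Summarize one variant's per-fold scores (hypothetical helper)."""
    m = mean(folds)
    s = pstdev(folds)  # assumption: population std; the real tool may differ
    worst = min(folds)
    if m > 0 and worst < 0:
        # Positive on average but loses on at least one fold.
        verdict = "OVERFIT"
    elif s > unstable_ratio * abs(m):
        # Assumed rule: spread across folds is larger than the mean itself.
        verdict = "UNSTABLE"
    else:
        verdict = "robust"
    return {"mean": m, "std": s, "worst": worst, "verdict": verdict}


def rank(variants):
    """Rank variants by mean, highest first, keeping the verdict alongside."""
    rows = [(name, summarize(folds)) for name, folds in variants.items()]
    return sorted(rows, key=lambda row: row[1]["mean"], reverse=True)


# Illustrative per-year returns, invented for this sketch.
variants = {
    "steady": [0.10, 0.12, 0.09, 0.11],
    "lucky_year": [0.60, -0.10, 0.02, -0.05],
    "choppy": [0.30, 0.01, 0.02, 0.03],
}

for name, row in rank(variants):
    print(f"{name:<12} {row['mean']:.3f} {row['std']:.3f} "
          f"{row['worst']:+.3f} {row['verdict']}")
```

In this made-up data `lucky_year` has the highest mean and would win a mean-only ranking, yet it is flagged because it loses in two of its four years.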
## When to invoke
- User says: "compare these backtest variants" / "which parameter set is robust" / "flag the overfit one" / "year-by-year results"
- Code in the conversation contains: a parameter sweep that produced per-fold or per-year scores for multiple variants.
## When NOT to invoke
- You only have a single aggregate number per variant (no per-fold breakdown to judge consistency).
- The task is to build the validation split itself (use [[walk-forward-runner]] first, then compare its folds here).
## Concrete example
User input:
```text
Three variants, four years of returns each. Tell me which is actually robust vs which just got lucky one year.
```
Output:
```text
variant          mean    std     worst    verdict
------------------------------------------------------
cross_only       0.188   0.027    0.150   robust
macd_filter      0.095   0.011    0.080   robust
pullback_ema21   0.075   0.205   -0.120   OVERFIT (positive mean, loses on a fold)
```
Code:
```python
# Copy assets/compare.py into your project, then:
from compare import comp