Install: `claude install-skill luiseiman/dotforge`
# Benchmark
Compare the effectiveness of a project's full dotforge configuration against a minimal baseline by executing the same standardized task in two isolated worktrees.
**Cost warning:** Each benchmark runs Claude Code twice (once with the full configuration, once with the minimal one). Use it sparingly, and only after Phases 0-2 are working.
## Prerequisites
- Project must have `.claude/settings.json` and `CLAUDE.md`
- Project must be a git repository with a clean working tree
- Task definitions must exist in `$DOTFORGE_DIR/tests/benchmark-tasks/`
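
The prerequisites above can be checked up front before any worktree is created. The following is a minimal sketch, not dotforge's own implementation: the function name `missing_prerequisites` is hypothetical, and it assumes that "clean working tree" means `git status --porcelain` prints nothing (so untracked files also count as unclean) and that task definitions are `.yml` files.

```python
import subprocess
from pathlib import Path


def missing_prerequisites(project_dir, dotforge_dir):
    """Return a list of problems; an empty list means the benchmark can run.

    Hypothetical helper for illustration only.
    """
    project = Path(project_dir)
    problems = []

    # Required configuration files.
    for rel in (".claude/settings.json", "CLAUDE.md"):
        if not (project / rel).is_file():
            problems.append(f"missing {rel}")

    # `git status --porcelain` prints nothing when the working tree is clean.
    try:
        result = subprocess.run(
            ["git", "-C", str(project), "status", "--porcelain"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        problems.append("git executable not found")
    else:
        if result.returncode != 0:
            problems.append("not a git repository")
        elif result.stdout.strip():
            problems.append("working tree is not clean")

    # Assumption: task definitions are the *.yml files in this directory.
    tasks_dir = Path(dotforge_dir) / "tests" / "benchmark-tasks"
    if not tasks_dir.is_dir() or not any(tasks_dir.glob("*.yml")):
        problems.append(f"no task definitions in {tasks_dir}")

    return problems
```

Returning a list of problems rather than raising on the first failure lets the caller report everything that needs fixing in one pass.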
## Step 1: Select task
1. Detect project stacks from `.claude/.forge-manifest.json` or infer from project files
2. Load matching task from `$DOTFORGE_DIR/tests/benchmark-tasks/{stack}.yml`
3. If multiple stacks match, let the user choose, or run the first match
4. If no stack matches, use `generic.yml`
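
The selection rules above can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual logic: the manifest key `stacks`, the marker-file table `MARKERS`, the stack names in it, and both function names are hypothetical, since the manifest schema and inference rules are not specified here. When several stacks match, the sketch takes the first one rather than prompting the user.

```python
import json
from pathlib import Path

# Hypothetical marker files used to infer a stack when no manifest exists.
MARKERS = {"pyproject.toml": "python", "package.json": "node"}


def detect_stacks(project_dir):
    """Read stacks from the manifest, or infer them from project files."""
    project = Path(project_dir)
    manifest = project / ".claude" / ".forge-manifest.json"
    if manifest.is_file():
        data = json.loads(manifest.read_text())
        stacks = data.get("stacks", [])  # assumed key name
        if stacks:
            return list(stacks)
    return [stack for marker, stack in MARKERS.items()
            if (project / marker).is_file()]


def select_task_file(stacks, tasks_dir):
    """Return the task file of the first matching stack, else generic.yml."""
    tasks_dir = Path(tasks_dir)
    for stack in stacks:
        candidate = tasks_dir / f"{stack}.yml"
        if candidate.is_file():
            return candidate
    return tasks_dir / "generic.yml"
```

A caller would pass `Path(os.environ["DOTFORGE_DIR"]) / "tests" / "benchmark-tasks"` as `tasks_dir`, then present the chosen task in the setup banner before asking for confirmation.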
Display:
```
═══ BENCHMARK SETUP ═══
Project: {{name}}
Stack detected: {{stack}}
Task: {{task title}}
Description: {{task description}}
⚠ This will run Claude Code twice in isolated worktrees.
Proceed? (yes/no)
```
## Step 2: Prepare worktrees
Create two git worktrees from the current HEAD:
1. **Full config**: `git worktree add /tmp/bench-full-{{slug}} HEAD`
   - Copy entire `.claude/` directory as-is
   - Copy `CLAUDE.md` as-is
2. **Minimal config**: `git worktree add /tmp/bench-minimal-{{slug}} HEAD`
   - Create minimal `CLAUDE.md` with only project name and "Build & Test" section
   - Create minimal `.claude/settings.json` with only `allowedTools` (no hooks, no den