cross-run-consistency

Runs the same test N times in one session, diffs the outputs, and classifies non-determinism by root cause. Complements observability's historical flake tracking with an immediate "does this test agree with itself right now?" answer. Gate contract — P0 scenarios must be strict-consistent (same output on N/N runs), no tolerance fuzzing, no silent averaging. PIPELINE-5 step 3.
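The strict gate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VibeFlow's actual implementation: the names `run_n_times`, `is_strict_consistent`, and `summarize` are hypothetical, and each test is modeled as a plain callable that returns its output as a string.

```python
from collections import Counter
from typing import Callable, Dict, List


def run_n_times(test: Callable[[], str], runs: int = 5) -> List[str]:
    """Invoke the (hypothetical) test callable `runs` times and collect outputs."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    return [test() for _ in range(runs)]


def is_strict_consistent(outputs: List[str]) -> bool:
    """Strict mode: all N runs must produce the identical output (N/N)."""
    return len(set(outputs)) == 1


def summarize(outputs: List[str]) -> Dict[str, object]:
    """Report how many runs agreed, without averaging or tolerance."""
    if not outputs:
        raise ValueError("no outputs to summarize")
    counts = Counter(outputs)
    # Size of the largest group of runs that produced the same output.
    _, agreeing = counts.most_common(1)[0]
    return {
        "runs": len(outputs),
        "distinct_outputs": len(counts),
        "agreeing_runs": agreeing,
        "strict_consistent": is_strict_consistent(outputs),
    }
```

A scenario passes the strict gate only when `strict_consistent` is true. Any run that differs fails it outright, so the sketch deliberately omits thresholds and majority voting.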
mytechsonamy/VibeFlow · ★ 0 · AI & Automation · score 75
Install: claude install-skill mytechsonamy/VibeFlow
# Cross-Run Consistency

An L2 Truth-Execution skill. It answers a specific question: **"If I run this test right now, five times in a row, will the five runs agree with each other?"**

That question is different from "has this test been flaky in the past", which is what `observability` MCP's `ob_track_flaky` tool answers. Historical flakiness looks backward across time, at runs that were separated by code changes, environment drift, and other noise. Cross-run consistency looks forward, in one session, against an unchanged codebase: any disagreement is pure non-determinism, because nothing else could have caused it. Flaky tests that only show up historically usually hide behind timing wobble; flaky tests that show up cross-run are faster to diagnose because the search space is smaller.

## When You're Invoked

- **PIPELINE-5 step 3**: pre-release, after the regression suite has produced a clean baseline. A cross-run on the critical-path scenarios before shipping catches the last-mile non-determinism that a single green run can hide.
- **On demand** as `/vibeflow:cross-run-consistency <scenario-glob> [--runs N] [--mode strict|tolerant]`.
- **From `regression-test-runner`** when a test classified `flaky` in the baseline needs a fresh, session-local reproduction attempt before `release-decision-engine` uses it as a hard blocker.

## Input Contract

| Input | Required | Notes |
|-------|----------|-------|
| Test scenario(s) | yes | Glob or explicit list. Matche