neuralforge-labs

Organization

5 indexed · 0 Featured · 4 stars · avg score 55

Prolific

Indexed Skills (5)

feature-development

Use this skill for ALL implementation work beyond trivial one-liners. Handles three intensities — Light (inline TDD + self-review, no agents), Medium (abbreviated spec audit + single-round architect+tester + phase-end code-reviewer+phase-auditor), and Deep (full 7-stage recipe). Auto-classifies at Stage 0 and proceeds immediately with a one-line announcement. The user can say "go deeper" or "go lighter" to adjust at any point.

4 Updated 1 week ago

AI & Automation Listed

golden-eval

Use this skill to run a fixed corpus of reference tasks against the current Claude configuration and detect regression vs a baseline. Triggers on "run golden eval", "check for drift", "did Claude get worse", "eval drift", or via a scheduled invocation (cron / Anthropic Routine via the /schedule skill). Captures cost, latency, and per-task pass/fail; flags any regression beyond a configurable threshold. Output: a JSON report at `~/.claude/skills/golden-eval/reports/<timestamp>.json` and (if regression detected) a PushNotification with the summary.

4 Updated 1 week ago

AI & Automation Listed
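The regression check that golden-eval describes can be sketched as follows. This is a minimal illustration, not the skill's actual code: the `TaskResult` fields, the `detect_regression` function, and the 5-point default threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    # Hypothetical per-task record; the real report schema is not shown in the listing.
    task_id: str
    passed: bool
    cost_usd: float
    latency_s: float


def pass_rate(results: list[TaskResult]) -> float:
    """Fraction of tasks that passed (0.0 for an empty run)."""
    return sum(r.passed for r in results) / len(results) if results else 0.0


def detect_regression(
    baseline: list[TaskResult],
    current: list[TaskResult],
    max_pass_rate_drop: float = 0.05,  # assumed threshold, configurable
) -> dict:
    """Compare a current run to a baseline and flag drops beyond the threshold."""
    base_by_id = {r.task_id: r for r in baseline}
    # Tasks that passed in the baseline but fail now.
    newly_failing = sorted(
        r.task_id
        for r in current
        if not r.passed and r.task_id in base_by_id and base_by_id[r.task_id].passed
    )
    drop = pass_rate(baseline) - pass_rate(current)
    return {
        "pass_rate_drop": drop,
        "newly_failing": newly_failing,
        "regression": drop > max_pass_rate_drop,
    }


baseline_run = [TaskResult(f"t{i}", True, 0.01, 1.0) for i in range(1, 5)]
current_run = baseline_run[:3] + [TaskResult("t4", False, 0.01, 1.2)]
report = detect_regression(baseline_run, current_run)
```

Here one of four tasks starts failing, so the pass rate drops by 0.25 and the run is flagged. A fuller version would presumably apply similar thresholds to cost and latency.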

live-evaluator

Use this skill in Stage 6 of feature-development to perform live verification with a fresh-context agent framed as a skeptical QA reviewer, separate from the implementer that wrote the code. Triggers on "live verify this", "evaluate this end-to-end", "QA this against the deployed environment", or via feature-development Stage 6 launch. Forces evaluation to come from an adversarial reviewer rather than from the implementer, which tends to praise its own work. Output: a verification report at `specs/<feature>/E2E_VERIFICATION.md` with reproducible commands, observed evidence, and a pass/fail per acceptance criterion.

4 Updated 1 week ago

Testing & QA Listed

property-test-generator

Use this skill when the user has a list of behavioral invariants for a function/module/feature and wants property-based tests generated from them. Triggers on phrases like "generate property tests", "make hypothesis tests for X", "property-test these invariants", "fuzz-test this function", or when the feature-development skill's Stage 1 spec audit identifies invariants worth exploring with Hypothesis. ALSO use proactively if you read a spec_audit.md that lists invariants without corresponding property tests: generating them during the spec audit is cheaper than discovering edge cases after implementation. Output: a Python test file using `hypothesis` (default) or `dart_check` for Flutter, with one `@given(...)` test per invariant, calibrated input strategies, and shrink-friendly assertions. The skill does NOT run the tests; it only generates them, and the user runs them.

4 Updated 1 week ago

Code & Development Listed
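To make "one `@given(...)` test per invariant" concrete, here is a small sketch of the kind of file property-test-generator might produce. It is an illustration only: the function `normalize_whitespace` and its two invariants are invented for the example, and the code assumes the `hypothesis` package is installed.

```python
from hypothesis import given, strategies as st


def normalize_whitespace(text: str) -> str:
    """Hypothetical function under test: collapse whitespace runs to single spaces."""
    return " ".join(text.split())


# Invariant 1: normalizing an already-normalized string changes nothing.
@given(st.text())
def test_normalize_is_idempotent(text):
    once = normalize_whitespace(text)
    assert normalize_whitespace(once) == once


# Invariant 2: the output never contains two consecutive spaces.
@given(st.text())
def test_normalize_has_no_double_spaces(text):
    assert "  " not in normalize_whitespace(text)
```

Each assertion compares plain values, so when Hypothesis shrinks a failing input the report points at a short counterexample. The tests can be run with pytest or called directly as functions.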

test-impact-graph

Use this skill to identify which tests are impacted by a code change by building the diff-to-tests reverse dependency graph (TDAD pattern). Triggers on phrases like "which tests should I run", "test impact for this diff", "skip tests that don't matter", or after a refactor when the user wants to skip running the full suite. Reads a Python codebase and computes the transitive set of test files that import (directly or via intermediate modules) the changed source files. Output: a list of test files (paths) to run, with the import chain that connects each test to a changed file. Can cut typical CI loop time by 5-10x for diff-scoped changes.

4 Updated 1 week ago
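The reverse dependency walk described for test-impact-graph can be sketched with the standard-library `ast` module. This is a simplified illustration under stated assumptions, not the skill's implementation: it works on an in-memory mapping of module names to source text instead of files on disk, ignores relative imports, and treats any module whose last name component starts with `test_` as a test. The names `imported_modules` and `impacted_tests` are hypothetical.

```python
import ast
from collections import defaultdict, deque


def imported_modules(source: str) -> set[str]:
    """Collect absolute module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Simplification: `from pkg import mod` is recorded as `pkg` only.
            found.add(node.module)
    return found


def impacted_tests(sources: dict[str, str], changed: set[str]) -> dict[str, list[str]]:
    """Map each impacted test module to the import chain leading to a changed module."""
    # Reverse edges: module -> modules that import it (project-internal only).
    importers = defaultdict(set)
    for module, source in sources.items():
        for dep in imported_modules(source):
            if dep in sources:
                importers[dep].add(module)

    # Breadth-first walk outward from the changed modules along reverse edges.
    chain = {m: [m] for m in changed}
    queue = deque(sorted(changed))
    while queue:
        current = queue.popleft()
        for importer in sorted(importers[current]):
            if importer not in chain:
                chain[importer] = [importer] + chain[current]
                queue.append(importer)

    return {m: path for m, path in chain.items() if m.split(".")[-1].startswith("test_")}


example_sources = {
    "app.pricing": "def price():\n    return 1\n",
    "app.cart": "from app.pricing import price\n",
    "tests.test_cart": "import app.cart\n",
    "tests.test_other": "import os\n",
}
impacted = impacted_tests(example_sources, {"app.pricing"})
```

In this example only `tests.test_cart` is selected, with the chain `tests.test_cart`, `app.cart`, `app.pricing`; `tests.test_other` never imports the changed module and is skipped.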

neuralforge-labs

Bio shown is the top-scored skill's repo description as a fallback — real GitHub bios land in a future update.
