experiment-queue

Solid

SSH job queue for multi-seed/multi-config ML experiments with OOM-aware retry, stale-screen cleanup, and wave-transition race prevention. Use when the user says "batch experiments", "队列实验" ("queue experiments"), "run grid", "multi-seed sweep", or "auto-chain experiments", or when /run-experiment is insufficient for 10+ jobs that need orchestration.

AI & Automation · 11,152 stars · 1,050 forks · Updated today · MIT

Quality Score: 96/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): score not shown
Documentation (15%): 100
Issue Health (10%): score not shown
License (10%): 100
Description (5%): 100

Skill Content

# Experiment Queue

> ⏱ **External cadence: visibility only.** This skill already runs its own
> detached server-side scheduler (60s poll + `depends_on` + wave transitions).
> Use its status output for overnight visibility (N done / N running / N
> pending); do **not** wrap it in a second `/loop` / `CronCreate` poll; that
> duplicates the scheduler on an uncoordinated clock and races the
> wave-transition logic it was built to prevent. See
> [`shared-references/external-cadence.md`](../shared-references/external-cadence.md)
> ("don't duplicate an existing scheduler").

Orchestrate large batches of ML experiments on SSH remote GPU servers with proper state tracking, OOM retry, stale cleanup, and wave transitions.

## When to Use This Skill

Use when `/run-experiment` is insufficient:

- **≥10 jobs** that need batching across GPUs
- **Multi-seed sweeps** (e.g., 21 seeds × 12 cells)
- **Wave transitions** (run wave 1, wait, run wave 2, wait, run wave 3...)
- **Teacher+student chains** (train teacher then distill; auto-trigger student after teacher done)
- **OOM-prone configs** where you need to retry with different GPU or wait
- **Mixed seed grids** where failed cells need re-running

Do NOT use for:

- Single ad-hoc experiment (use `/run-experiment`)
- Modal/Vast.ai deployments (those have their own orchestration)
- Experiments that need manual inspection between runs

## Why This Exists

Based on session audit (2026-04-16), the major wall-clock sinks in multi-seed grid experime...
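
As an illustration of the `depends_on` gating and the "N done / N running / N pending" status line mentioned in the excerpt above, here is a minimal sketch of a single scheduler tick. It is not the skill's actual implementation: the `Job` class, the state names, and the `runnable`, `poll_once`, and `summary` helpers are all hypothetical, and the sketch ignores SSH, screen sessions, and OOM handling entirely.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions, not the skill's schema."""
    name: str
    depends_on: list[str] = field(default_factory=list)
    state: str = "pending"  # one of: pending, running, done, failed


def runnable(jobs: dict[str, Job]) -> list[str]:
    """Return pending jobs whose dependencies have all reached 'done'."""
    return [
        job.name
        for job in jobs.values()
        if job.state == "pending"
        and all(jobs[dep].state == "done" for dep in job.depends_on)
    ]


def poll_once(jobs: dict[str, Job], free_gpus: int) -> list[str]:
    """One scheduler tick: start at most `free_gpus` runnable jobs."""
    started = runnable(jobs)[:free_gpus]
    for name in started:
        jobs[name].state = "running"
    return started


def summary(jobs: dict[str, Job]) -> str:
    """Format the visibility line: N done / N running / N pending."""
    counts = Counter(job.state for job in jobs.values())
    return (
        f"{counts['done']} done / {counts['running']} running / "
        f"{counts['pending']} pending"
    )


# Teacher+student chain: the student must wait for the teacher.
jobs = {
    "teacher": Job("teacher"),
    "student": Job("student", depends_on=["teacher"]),
}
print(poll_once(jobs, free_gpus=2))  # ['teacher']
print(summary(jobs))                 # 0 done / 1 running / 1 pending
jobs["teacher"].state = "done"       # pretend the teacher run finished
print(poll_once(jobs, free_gpus=2))  # ['student']
```

Because a single loop owns every state change in this sketch, no second poller can start the student before the teacher is marked done, which is the kind of race the note about not adding an external `/loop` or `CronCreate` poll warns against.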

Details

Author: wanshuiyin
Repository: wanshuiyin/Auto-claude-code-research-in-sleep
Created: 2 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

OpenAI · AI
