autoresearch-agent

Solid

Autonomous experiment loop that optimizes any file by a measurable metric. Inspired by Karpathy's autoresearch. The agent edits a target file, runs a fixed evaluation, keeps improvements (git commit), discards failures (git reset), and loops indefinitely. Use when: user wants to optimize code speed, reduce bundle/image size, improve test pass rate, optimize prompts, improve content quality (headlines, copy, CTR), or run any measurable improvement loop. Requires: a target file, an evaluation command that outputs a metric, and a git repo.

AI & Automation 16,782 stars 2,310 forks Updated 3 days ago MIT

Quality Score: 96/100

Stars (weight 20%): 100
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Autoresearch Agent

> You sleep. The agent experiments. You wake up to results.

Autonomous experiment loop inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). The agent edits one file, runs a fixed evaluation, keeps improvements, discards failures, and loops indefinitely. Not one guess, but fifty measured attempts, compounding.

---

## Slash Commands

| Command | What it does |
|---------|-------------|
| `/ar:setup` | Set up a new experiment interactively |
| `/ar:run` | Run a single experiment iteration |
| `/ar:loop` | Start autonomous loop with configurable interval (10m, 1h, daily, weekly, monthly) |
| `/ar:status` | Show dashboard and results |
| `/ar:resume` | Resume a paused experiment |

---

## When This Skill Activates

Recognize these patterns from the user:

- "Make this faster / smaller / better"
- "Optimize [file] for [metric]"
- "Improve my [headlines / copy / prompts]"
- "Run experiments overnight"
- "I want to get [metric] from X to Y"
- Any request involving: optimize, benchmark, improve, experiment loop, autoresearch

If the user describes a target file + a way to measure success → this skill applies.

---

## Setup

### First Time: Create the Experiment

Run the setup script. The user decides where experiments live:

**Project-level** (inside repo, git-tracked, shareable with team):

```bash
python scripts/setup_experiment.py \
  --domain engineering \
  --name api-speed \
  --target src/api/search.py \
  --eval "pytest benc...
```
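
The keep/discard loop described above (edit, evaluate, commit on improvement, reset on failure) can be sketched roughly as follows. This is an illustrative sketch only, not the skill's actual implementation: the names `parse_metric`, `run_eval`, `git_keep`, `git_discard`, and `ratchet` are hypothetical, the convention that the metric is the last number printed by the evaluation command is an assumption, and the loop is bounded by `iterations` here rather than running indefinitely.

```python
import re
import subprocess
from typing import Callable


def parse_metric(output: str) -> float:
    """Return the last number in the evaluation output (assumed convention)."""
    numbers = re.findall(r"[-+]?\d+(?:\.\d+)?", output)
    if not numbers:
        raise ValueError("evaluation output contained no number")
    return float(numbers[-1])


def run_eval(command: str) -> float:
    """Run the fixed evaluation command and parse its metric from stdout."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return parse_metric(result.stdout)


def git_keep(message: str) -> None:
    """Keep an improvement: commit all changes to tracked files."""
    subprocess.run(["git", "commit", "-am", message], check=True)


def git_discard() -> None:
    """Discard a failed attempt: restore tracked files to the last commit."""
    subprocess.run(["git", "reset", "--hard", "HEAD"], check=True)


def ratchet(
    propose: Callable[[int], None],   # edits the target file (the agent's job)
    evaluate: Callable[[], float],    # returns the metric for the current state
    keep: Callable[[str], None],      # e.g. git_keep
    discard: Callable[[], None],      # e.g. git_discard
    baseline: float,
    iterations: int,
    lower_is_better: bool = True,
) -> float:
    """Run a bounded number of experiments and return the best metric seen."""
    best = baseline
    for i in range(iterations):
        propose(i)
        try:
            score = evaluate()
        except (subprocess.CalledProcessError, ValueError):
            # A crashed or unparseable evaluation counts as a failure.
            discard()
            continue
        improved = score < best if lower_is_better else score > best
        if improved:
            best = score
            keep(f"experiment {i}: metric {score}")
        else:
            discard()
    return best
```

Passing the keep and discard steps in as callables keeps the loop independent of git, so it can be exercised with in-memory stand-ins before being pointed at a real repository with `git_keep`, `git_discard`, and a `run_eval` wrapper around the evaluation command.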

Details

Author: alirezarezvani
Repository: alirezarezvani/claude-skills
Created: 7 months ago
Last Updated: 3 days ago
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

autoresearch

Autonomous iterative experimentation loop for any programming task. Guides the user through defining goals, measurable metrics, and scope constraints, then runs an autonomous loop of code changes, testing, measuring, and keeping/discarding results. Inspired by Karpathy's autoresearch. USE FOR: autonomous improvement, iterative optimization, experiment loop, auto research, performance tuning, automated experimentation, hill climbing, try things automatically, optimize code, run experiments, autonomous coding loop. DO NOT USE FOR: one-shot tasks, simple bug fixes, code review, or tasks without a measurable metric.

34,233 Updated today
github
AI & Automation Listed

autoresearch

Karpathy's autoresearch: autonomous ratcheting optimization loops for any artifact. A human writes program.md, the agent runs experiments with git-backed keep/revert. Trigger on "optimize this", "make this better", "iterate on", "autoresearch", "loop on this", "A/B test", "find the best version", Karpathy's loop, experiment loops, hill climbing, the ratchet pattern, or program.md workflows. Works across code, prompts, content, models, and configs.

0 Updated 3 days ago
Evarodenas
AI & Automation Listed

autoresearch

Autonomous experiment loop inspired by Karpathy's autoresearch. Iteratively modifies code, runs evaluation, measures a metric, and keeps or discards changes using git. Use when optimizing code against a measurable target (test pass rate, performance, bundle size, model quality, etc).

2 Updated 5 days ago
Silex-Research
AI & Automation Listed

autoresearch

Karpathy-pattern autoresearch — autonomous hill-climbing over a measurable metric, deep multi-agent research, or research-then-optimize. Three modes: Optimize (keep/discard ratchet), Research (STORM multi-perspective), Improve.

3 Updated yesterday
air-gapped
AI & Automation Listed

autoresearch

Check and run autonomous experiments. Query experiment status, view results dashboards, and execute iterations. TRIGGER when: user asks about experiment status, autoresearch progress, "how's the experiment going", "run another iteration", or invokes "/autoresearch". DO NOT TRIGGER when: user is working on autoresearch agent code itself.

1 Updated 1 week ago
DROOdotFOO