autoresearch-agent

Featured

Autonomous experiment loop that optimizes any file by a measurable metric. Inspired by Karpathy's autoresearch. The agent edits a target file, runs a fixed evaluation, keeps improvements (git commit), discards failures (git reset), and loops indefinitely. Use when: user wants to optimize code speed, reduce bundle/image size, improve test pass rate, optimize prompts, improve content quality (headlines, copy, CTR), or run any measurable improvement loop. Requires: a target file, an evaluation command that outputs a metric, and a git repo.
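The keep-or-discard cycle described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `experiment_loop`, `git_keep`, `git_discard`, and the `propose`/`evaluate` callables are hypothetical names, the loop is capped at `max_iterations` instead of running indefinitely, and a lower metric is assumed to be better unless stated otherwise.

```python
import subprocess
from typing import Callable


def git_keep(message: str) -> None:
    """Record an improvement as a commit (assumes the cwd is a git repo)."""
    subprocess.run(["git", "commit", "-am", message], check=True)


def git_discard() -> None:
    """Throw away uncommitted edits to tracked files."""
    subprocess.run(["git", "reset", "--hard", "HEAD"], check=True)


def experiment_loop(
    propose: Callable[[], None],      # hypothetical: edits the target file
    evaluate: Callable[[], float],    # hypothetical: runs the fixed evaluation
    keep: Callable[[str], None] = git_keep,
    discard: Callable[[], None] = git_discard,
    max_iterations: int = 50,         # assumption: bounded for illustration
    lower_is_better: bool = True,     # assumption: e.g. latency or size
) -> float:
    """Run edit -> evaluate -> keep/discard and return the best metric seen."""
    best = evaluate()  # baseline before any edit
    for i in range(1, max_iterations + 1):
        propose()
        score = evaluate()
        improved = score < best if lower_is_better else score > best
        if improved:
            keep(f"experiment {i}: metric {best} -> {score}")
            best = score
        else:
            discard()
    return best
```

Passing `keep` and `discard` as parameters keeps the decision logic separate from git, so the loop can be exercised with in-memory stand-ins before it is pointed at a real repository.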

AI & Automation · 23,263 stars · 3,198 forks · Updated 1 week ago · MIT

Quality Score: 94/100

Stars (20%): 100
Recency (20%)
Frontmatter (20%)
Documentation (15%): 100
Issue Health (10%)
License (10%): 100
Description (5%): 100

Skill Content

# Autoresearch Agent

> You sleep. The agent experiments. You wake up to results.

Autonomous experiment loop inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). The agent edits one file, runs a fixed evaluation, keeps improvements, discards failures, and loops indefinitely.

Not one guess: fifty measured attempts, compounding.

---

## Slash Commands

| Command | What it does |
|---------|-------------|
| `/ar:setup` | Set up a new experiment interactively |
| `/ar:run` | Run a single experiment iteration |
| `/ar:loop` | Start autonomous loop with configurable interval (10m, 1h, daily, weekly, monthly) |
| `/ar:status` | Show dashboard and results |
| `/ar:resume` | Resume a paused experiment |

---

## When This Skill Activates

Recognize these patterns from the user:

- "Make this faster / smaller / better"
- "Optimize [file] for [metric]"
- "Improve my [headlines / copy / prompts]"
- "Run experiments overnight"
- "I want to get [metric] from X to Y"
- Any request involving: optimize, benchmark, improve, experiment loop, autoresearch

If the user describes a target file + a way to measure success → this skill applies.

---

## Setup

### First Time: Create the Experiment

Run the setup script. The user decides where experiments live:

**Project-level** (inside repo, git-tracked, shareable with team):

```bash
python scripts/setup_experiment.py \
  --domain engineering \
  --name api-speed \
  --target src/api/search.py \
  --eval "pytest benc...
```
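The skill requires an evaluation command that outputs a metric. The excerpt does not show how that metric is read, so the following is only one possible convention, sketched under a loud assumption: the metric is the last number on the last non-empty line of the command's standard output. `parse_metric` and `run_eval` are hypothetical helper names, not part of the skill.

```python
import re
import subprocess
import sys

# Matches integers, decimals, and scientific notation, with optional sign.
_NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


def parse_metric(output: str) -> float:
    """Return the last number on the last non-empty line (assumed convention)."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("evaluation produced no output")
    matches = _NUMBER.findall(lines[-1])
    if not matches:
        raise ValueError(f"no number found in: {lines[-1]!r}")
    return float(matches[-1])


def run_eval(command: list[str], timeout_s: float = 600.0) -> float:
    """Run the evaluation command and parse its metric from stdout."""
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # a failing evaluation raises instead of yielding a metric
    )
    return parse_metric(result.stdout)


if __name__ == "__main__":
    # Stand-in for a real evaluation command.
    print(run_eval([sys.executable, "-c", "print('latency_ms:', 41.5)"]))
```

Raising on empty or non-numeric output, rather than returning a default, keeps a broken evaluation from being mistaken for a measured result.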

Details

Author: alirezarezvani
Repository: alirezarezvani/claude-skills
Created: 9 months ago
Last Updated: 1 week ago
Language: Python
License: MIT

Integrates with

OpenAI (AI) · Anthropic (AI) · pytest (Testing)

Bundled in these plugins

claude-skills

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

autoresearch

Autonomous iterative experimentation loop for any programming task. Guides the user through defining goals, measurable metrics, and scope constraints, then runs an autonomous loop of code changes, testing, measuring, and keeping/discarding results. Inspired by Karpathy's autoresearch. USE FOR: autonomous improvement, iterative optimization, experiment loop, auto research, performance tuning, automated experimentation, hill climbing, try things automatically, optimize code, run experiments, autonomous coding loop. DO NOT USE FOR: one-shot tasks, simple bug fixes, code review, or tasks without a measurable metric.

14 Updated yesterday

a-tokyo

AI & Automation Solid

autoresearch

Autonomous experiment loop: edit code, commit, run benchmark, extract metrics, keep improvements or revert, repeat forever. Use this skill when the user asks to "run autoresearch", "start an experiment loop", "optimize a metric autonomously", "autonomous experiments", "benchmark loop", "keep/discard experiments", "optimize test speed", "optimize bundle size", "optimize build time", "run experiments overnight", "speed up my tests", "make my build faster", "reduce compile time", "keep trying until it's faster", "run experiments while I sleep", "overnight optimization", "edit-measure-keep loop", "autoresearch status", or mentions "autoresearch", "experiment loop", "autonomous optimization". Always use this skill when the user wants to iteratively and autonomously improve any measurable metric — even if they don't use the word "autoresearch". Also use when the user asks about the status of a running autoresearch session or wants to cancel/stop one.

12 Updated 1 week ago

proyecto26

AI & Automation Solid

autoresearch

Autonomous experiment loop inspired by Karpathy's autoresearch. Iteratively modifies code, runs evaluation, measures a metric, and keeps or discards changes using git. Use when optimizing code against a measurable target (test pass rate, performance, bundle size, model quality, etc).

3 Updated today

Silex-Research