pufferlib

Solid

High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.

AI & Automation 26,817 stars 2,774 forks Updated today MIT

Quality Score: 96/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100
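The component weights listed above sum to 100%, so one natural reading is that the overall score is a weighted average of the components. The sketch below computes that average under this assumption. The page does not state its actual formula, and the names `COMPONENTS` and `weighted_score` are illustrative only. A plain weighted average comes to 89 rather than the displayed 96, so the published score presumably includes adjustments not shown here.

```python
# Component weights (in percent) and scores, copied from the listing above.
COMPONENTS = {
    "Stars": (20, 100),
    "Recency": (20, 100),
    "Frontmatter": (20, 70),
    "Documentation": (15, 100),
    "Issue Health": (10, 50),
    "License": (10, 100),
    "Description": (5, 100),
}


def weighted_score(components):
    """Return the weighted average of (weight_percent, score) pairs."""
    total_weight = sum(weight for weight, _ in components.values())
    weighted_sum = sum(weight * score for weight, score in components.values())
    return weighted_sum / total_weight


if __name__ == "__main__":
    print(weighted_score(COMPONENTS))  # prints 89.0
```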

Skill Content

# PufferLib - High-Performance Reinforcement Learning

## Overview

PufferLib is a high-performance reinforcement learning library designed for fast parallel environment simulation and training. It achieves training at millions of steps per second through optimized vectorization, native multi-agent support, and efficient PPO implementation (PuffeRL). The library provides the Ocean suite of 20+ environments and seamless integration with Gymnasium, PettingZoo, and specialized RL frameworks.

## When to Use This Skill

Use this skill when:

- **Training RL agents** with PPO on any environment (single or multi-agent)
- **Creating custom environments** using the PufferEnv API
- **Optimizing performance** for parallel environment simulation (vectorization)
- **Integrating existing environments** from Gymnasium, PettingZoo, Atari, Procgen, etc.
- **Developing policies** with CNN, LSTM, or custom architectures
- **Scaling RL** to millions of steps per second for faster experimentation
- **Multi-agent RL** with native multi-agent environment support

## Core Capabilities

### 1. High-Performance Training (PuffeRL)

PuffeRL is PufferLib's optimized PPO+LSTM training algorithm achieving 1M-4M steps/second.

**Quick start training:**

```bash
# CLI training
puffer train procgen-coinrun --train.device cuda --train.learning-rate 3e-4

# Distributed training
torchrun --nproc_per_node=4 train.py
```

**Python training loop:**

```python
import pufferlib
from pufferlib import PuffeRL

# Create v...
```
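The skill content above describes PuffeRL as an optimized PPO implementation. For readers unfamiliar with PPO, its standard clipped surrogate loss can be sketched in plain Python as follows. This is a generic textbook illustration, not PufferLib's code: the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and old policy.
        ratio = math.exp(nl - ol)
        # Keep the ratio within [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (minimum) of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

With a positive advantage, the clip caps how much a large ratio can be rewarded. With a negative advantage, the unclipped term is kept when it is the more pessimistic of the two, so a harmful move away from the old policy is still fully penalized.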

Details

Author: K-Dense-AI
Repository: K-Dense-AI/scientific-agent-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

pufferlib

This skill should be used when working with reinforcement learning tasks including high-performance RL training, custom environment development, vectorized parallel simulation, multi-agent systems, or integration with existing RL environments (Gymnasium, PettingZoo, Atari, Procgen, etc.). Use this skill for implementing PPO training, creating PufferEnv environments, optimizing RL performance, or developing policies with CNNs/LSTMs.

2,210 Updated 1 week ago
foryourhealth111-pixel
AI & Automation Solid

pufferlib

This skill should be used when working with reinforcement learning tasks including high-performance RL training, custom environment development, vectorized parallel simulation, multi-agent systems, or integration with existing RL environments (Gymnasium, PettingZoo, Atari, Procgen, etc.). Use this skill for implementing PPO training, creating PufferEnv environments, optimizing RL performance, or developing policies with CNNs/LSTMs.

27,705 Updated today
davila7
AI & Automation Listed

pufferlib

This skill should be used when working with reinforcement learning tasks including high-performance RL training, custom environment development, vectorized parallel simulation, multi-agent systems, or integration with existing RL environments (Gymnasium, PettingZoo, Atari, Procgen, etc.). Use this skill for implementing PPO training, creating PufferEnv environments, optimizing RL performance, or developing policies with CNNs/LSTMs.

335 Updated today
aiskillstore
AI & Automation Solid

stable-baselines3

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

26,817 Updated today
K-Dense-AI
AI & Automation Solid

stable-baselines3

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

2,210 Updated 1 week ago
foryourhealth111-pixel