stable-baselines3

Use this skill for reinforcement learning tasks including training RL agents (PPO, SAC, DQN, TD3, DDPG, A2C, etc.), creating custom Gym environments, implementing callbacks for monitoring and control, using vectorized environments for parallel training, and integrating with deep RL workflows. This skill should be used when users request RL algorithm implementation, agent training, environment design, or RL experimentation.
aiskillstore/marketplace · ★ 334 · AI & Automation · score 80
Install: claude install-skill aiskillstore/marketplace
# Stable Baselines3

## Overview

Stable Baselines3 (SB3) is a PyTorch-based library providing reliable implementations of reinforcement learning algorithms. This skill provides guidance for training RL agents, creating custom environments, implementing callbacks, and optimizing training workflows using SB3's unified API.

## Core Capabilities

### 1. Training RL Agents

**Basic Training Pattern:**

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Create environment
env = gym.make("CartPole-v1")

# Initialize agent
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent
model.learn(total_timesteps=10000)

# Save the model
model.save("ppo_cartpole")

# Load the model (without prior instantiation)
model = PPO.load("ppo_cartpole", env=env)
```

**Important Notes:**

- `total_timesteps` is a lower bound; actual training may exceed it because experience is collected in whole batches
- Call `load()` on the algorithm class (e.g. `PPO.load(...)`), not on an existing instance; it is a class method that returns a new model
- For off-policy algorithms, the replay buffer is NOT saved with the model, to save space; save it separately with `save_replay_buffer()` if it is needed

**Algorithm Selection:**

Use `references/algorithms.md` for detailed algorithm characteristics and selection guidance. Quick reference:

- **PPO/A2C**: General-purpose, supports all action space types, good for multiprocessing
- **SAC/TD3**: Continuous control, off-policy, sample-efficient
- **DQN**: Discrete actions, off-policy
- **HER**: Goal-conditioned tasks (used as a replay buffer with off-policy algorithms)

See `scripts/train_rl_agent.py` for a complete training template with best practices.

### 2. Cu