Reinforcement Learning Wrappers

We provide wrappers for several reinforcement learning libraries. These wrappers convert the data from the environments into the function argument and return types that each library expects.
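
As a rough illustration of what such a wrapper does, the sketch below adapts a toy batched environment to a hypothetical library interface. `ToyBatchedEnv`, `ToyLibraryWrapper`, the `"policy"` observation key, and the 4-tuple return are all assumptions made for this example. They are not the actual Isaac Lab or library classes, which work on tensors or arrays and handle many more details.

```python
class ToyBatchedEnv:
    """Hypothetical stand-in for a vectorized environment (not an Isaac Lab class)."""

    def __init__(self, num_envs: int, max_steps: int = 3):
        self.num_envs = num_envs
        self.max_steps = max_steps
        self._t = 0

    def reset(self):
        self._t = 0
        # Observations are grouped in a dict; each group holds one row per environment.
        return {"policy": [[0.0, 0.0] for _ in range(self.num_envs)]}

    def step(self, actions):
        self._t += 1
        obs = {"policy": [[float(self._t), float(a)] for a in actions]}
        rewards = [1.0 for _ in actions]
        terminated = [False] * self.num_envs
        truncated = [self._t >= self.max_steps] * self.num_envs
        return obs, rewards, terminated, truncated, {}


class ToyLibraryWrapper:
    """Hypothetical wrapper for a library that expects flat observations
    and a single done flag per environment."""

    def __init__(self, env: ToyBatchedEnv):
        self.env = env

    def reset(self):
        # Return type conversion: unpack the observation group the library consumes.
        return self.env.reset()["policy"]

    def step(self, actions):
        # Argument conversion: the library may pass any iterable of actions.
        obs, rewards, terminated, truncated, info = self.env.step(list(actions))
        # Return type conversion: merge the two termination signals into one flag.
        dones = [t or u for t, u in zip(terminated, truncated)]
        return obs["policy"], rewards, dones, info
```

With this sketch, `ToyLibraryWrapper(ToyBatchedEnv(num_envs=2)).step((0.5, -0.5))` returns plain per-environment lists, so the hypothetical library never needs to know about the dictionary layout or the separate termination signals.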

Stable-Baselines3

  • Training an agent with Stable-Baselines3 on Isaac-Cartpole-v0:

    # install python module (for stable-baselines3)
    ./isaaclab.sh -i sb3
    # run script for training
    # note: we set the device to cpu since SB3 doesn't optimize for GPU anyway
    ./isaaclab.sh -p source/standalone/workflows/sb3/train.py --task Isaac-Cartpole-v0 --headless --device cpu
    # run script for playing with 32 environments
    ./isaaclab.sh -p source/standalone/workflows/sb3/play.py --task Isaac-Cartpole-v0 --num_envs 32 --checkpoint /PATH/TO/model.zip
    # run script for recording video of a trained agent (requires installing `ffmpeg`)
    ./isaaclab.sh -p source/standalone/workflows/sb3/play.py --task Isaac-Cartpole-v0 --headless --video --video_length 200
    
    :: install python module (for stable-baselines3)
    isaaclab.bat -i sb3
    :: run script for training
    :: note: we set the device to cpu since SB3 doesn't optimize for GPU anyway
    isaaclab.bat -p source\standalone\workflows\sb3\train.py --task Isaac-Cartpole-v0 --headless --device cpu
    :: run script for playing with 32 environments
    isaaclab.bat -p source\standalone\workflows\sb3\play.py --task Isaac-Cartpole-v0 --num_envs 32 --checkpoint /PATH/TO/model.zip
    :: run script for recording video of a trained agent (requires installing `ffmpeg`)
    isaaclab.bat -p source\standalone\workflows\sb3\play.py --task Isaac-Cartpole-v0 --headless --video --video_length 200
    

SKRL

  • Training an agent with SKRL on Isaac-Reach-Franka-v0:

    # install python module (for skrl)
    ./isaaclab.sh -i skrl
    # run script for training
    ./isaaclab.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Reach-Franka-v0 --headless
    # run script for playing with 32 environments
    ./isaaclab.sh -p source/standalone/workflows/skrl/play.py --task Isaac-Reach-Franka-v0 --num_envs 32 --checkpoint /PATH/TO/model.pt
    # run script for recording video of a trained agent (requires installing `ffmpeg`)
    ./isaaclab.sh -p source/standalone/workflows/skrl/play.py --task Isaac-Reach-Franka-v0 --headless --video --video_length 200
    
    :: install python module (for skrl)
    isaaclab.bat -i skrl
    :: run script for training
    isaaclab.bat -p source\standalone\workflows\skrl\train.py --task Isaac-Reach-Franka-v0 --headless
    :: run script for playing with 32 environments
    isaaclab.bat -p source\standalone\workflows\skrl\play.py --task Isaac-Reach-Franka-v0 --num_envs 32 --checkpoint /PATH/TO/model.pt
    :: run script for recording video of a trained agent (requires installing `ffmpeg`)
    isaaclab.bat -p source\standalone\workflows\skrl\play.py --task Isaac-Reach-Franka-v0 --headless --video --video_length 200
    
    # install python module (for skrl)
    ./isaaclab.sh -i skrl
    # install skrl dependencies for JAX. Visit https://skrl.readthedocs.io/en/latest/intro/installation.html for more details
    ./isaaclab.sh -p -m pip install "skrl[jax]"
    # run script for training
    ./isaaclab.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Reach-Franka-v0 --headless --ml_framework jax
    # run script for playing with 32 environments
    ./isaaclab.sh -p source/standalone/workflows/skrl/play.py --task Isaac-Reach-Franka-v0 --num_envs 32 --ml_framework jax --checkpoint /PATH/TO/model.pt
    # run script for recording video of a trained agent (requires installing `ffmpeg`)
    ./isaaclab.sh -p source/standalone/workflows/skrl/play.py --task Isaac-Reach-Franka-v0 --headless --ml_framework jax --video --video_length 200
    
  • Training the multi-agent environment Isaac-Shadow-Hand-Over-Direct-v0 with skrl:

    # install python module (for skrl)
    ./isaaclab.sh -i skrl
    # run script for training with the MAPPO algorithm (IPPO is also supported)
    ./isaaclab.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Shadow-Hand-Over-Direct-v0 --headless --algorithm MAPPO
    # run script for playing with 32 environments with the MAPPO algorithm (IPPO is also supported)
    ./isaaclab.sh -p source/standalone/workflows/skrl/play.py --task Isaac-Shadow-Hand-Over-Direct-v0 --num_envs 32 --algorithm MAPPO --checkpoint /PATH/TO/model.pt
    
    :: install python module (for skrl)
    isaaclab.bat -i skrl
    :: run script for training with the MAPPO algorithm (IPPO is also supported)
    isaaclab.bat -p source\standalone\workflows\skrl\train.py --task Isaac-Shadow-Hand-Over-Direct-v0 --headless --algorithm MAPPO
    :: run script for playing with 32 environments with the MAPPO algorithm (IPPO is also supported)
    isaaclab.bat -p source\standalone\workflows\skrl\play.py --task Isaac-Shadow-Hand-Over-Direct-v0 --num_envs 32 --algorithm MAPPO --checkpoint /PATH/TO/model.pt
    

RL-Games

  • Training an agent with RL-Games on Isaac-Ant-v0:

    # install python module (for rl-games)
    ./isaaclab.sh -i rl_games
    # run script for training
    ./isaaclab.sh -p source/standalone/workflows/rl_games/train.py --task Isaac-Ant-v0 --headless
    # run script for playing with 32 environments
    ./isaaclab.sh -p source/standalone/workflows/rl_games/play.py --task Isaac-Ant-v0 --num_envs 32 --checkpoint /PATH/TO/model.pth
    # run script for recording video of a trained agent (requires installing `ffmpeg`)
    ./isaaclab.sh -p source/standalone/workflows/rl_games/play.py --task Isaac-Ant-v0 --headless --video --video_length 200
    
    :: install python module (for rl-games)
    isaaclab.bat -i rl_games
    :: run script for training
    isaaclab.bat -p source\standalone\workflows\rl_games\train.py --task Isaac-Ant-v0 --headless
    :: run script for playing with 32 environments
    isaaclab.bat -p source\standalone\workflows\rl_games\play.py --task Isaac-Ant-v0 --num_envs 32 --checkpoint /PATH/TO/model.pth
    :: run script for recording video of a trained agent (requires installing `ffmpeg`)
    isaaclab.bat -p source\standalone\workflows\rl_games\play.py --task Isaac-Ant-v0 --headless --video --video_length 200
    

RSL-RL

  • Training an agent with RSL-RL on Isaac-Reach-Franka-v0:

    # install python module (for rsl-rl)
    ./isaaclab.sh -i rsl_rl
    # run script for training
    ./isaaclab.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Reach-Franka-v0 --headless
    # run script for playing with 32 environments
    ./isaaclab.sh -p source/standalone/workflows/rsl_rl/play.py --task Isaac-Reach-Franka-v0 --num_envs 32 --load_run run_folder_name --checkpoint model.pt
    # run script for recording video of a trained agent (requires installing `ffmpeg`)
    ./isaaclab.sh -p source/standalone/workflows/rsl_rl/play.py --task Isaac-Reach-Franka-v0 --headless --video --video_length 200
    
    :: install python module (for rsl-rl)
    isaaclab.bat -i rsl_rl
    :: run script for training
    isaaclab.bat -p source\standalone\workflows\rsl_rl\train.py --task Isaac-Reach-Franka-v0 --headless
    :: run script for playing with 32 environments
    isaaclab.bat -p source\standalone\workflows\rsl_rl\play.py --task Isaac-Reach-Franka-v0 --num_envs 32 --load_run run_folder_name --checkpoint model.pt
    :: run script for recording video of a trained agent (requires installing `ffmpeg`)
    isaaclab.bat -p source\standalone\workflows\rsl_rl\play.py --task Isaac-Reach-Franka-v0 --headless --video --video_length 200
    

All the scripts above log training progress to TensorBoard in the logs directory at the root of the repository. Each run is written to a directory that follows the pattern logs/<library>/<task>/<date-time>, where <library> is the name of the learning framework, <task> is the task name, and <date-time> is the timestamp at which the training script was executed.
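
The directory pattern can be sketched in a few lines of Python. The helper names `make_run_dir` and `latest_run_dir` are hypothetical, and the timestamp format `%Y-%m-%d_%H-%M-%S` is an assumption chosen so that directory names sort chronologically; the training scripts may format their timestamps differently.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_run_dir(library: str, task: str, root: str = "logs",
                 now: Optional[datetime] = None) -> Path:
    """Build a path of the form <root>/<library>/<task>/<date-time>."""
    # Assumed timestamp format: zero-padded fields sort lexicographically by time.
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M-%S")
    return Path(root) / library / task / stamp


def latest_run_dir(library: str, task: str, root: str = "logs") -> Optional[Path]:
    """Return the most recent run directory for a task, or None if there is none."""
    task_dir = Path(root) / library / task
    if not task_dir.is_dir():
        return None
    # Sorting by name is enough because of the assumed timestamp format.
    runs = sorted(p for p in task_dir.iterdir() if p.is_dir())
    return runs[-1] if runs else None
```

For example, `latest_run_dir("sb3", "Isaac-Cartpole-v0")` would look under logs/sb3/Isaac-Cartpole-v0 and return the newest run directory, which is a convenient place to start when searching for a checkpoint to pass to a play script.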

To view the logs, run:

    # execute from the root directory of the repository
    ./isaaclab.sh -p -m tensorboard.main --logdir=logs

    :: execute from the root directory of the repository
    isaaclab.bat -p -m tensorboard.main --logdir=logs