Modifying an existing Direct RL Environment#

Having learnt how to create a task in Creating a Direct Workflow RL Environment, register it in Registering an Environment, and train it in Training with an RL Agent, we will now look at how to make minor modifications to an existing task.

Sometimes it is necessary to create, due to complexity or variations from existing examples, tasks from scratch. However, in certain situations, it is possible to start from the existing code and introduce minor changes, one by one, to transform them according to our needs.

In this tutorial, we will make minor modifications to the direct workflow Humanoid task to change the simple humanoid model to the Unitree H1 humanoid robot without affecting the original code.

The Base Code#

For this tutorial, we start from the direct workflow Humanoid environment defined in omni.isaac.lab_tasks.direct.humanoid module.

Code for humanoid_env.py
 1# Copyright (c) 2022-2024, The Isaac Lab Project Developers.
 2# All rights reserved.
 3#
 4# SPDX-License-Identifier: BSD-3-Clause
 5
 6from __future__ import annotations
 7
 8from omni.isaac.lab_assets import HUMANOID_CFG
 9
10import omni.isaac.lab.sim as sim_utils
11from omni.isaac.lab.assets import ArticulationCfg
12from omni.isaac.lab.envs import DirectRLEnvCfg
13from omni.isaac.lab.scene import InteractiveSceneCfg
14from omni.isaac.lab.sim import SimulationCfg
15from omni.isaac.lab.terrains import TerrainImporterCfg
16from omni.isaac.lab.utils import configclass
17
18from omni.isaac.lab_tasks.direct.locomotion.locomotion_env import LocomotionEnv
19
20
21@configclass
22class HumanoidEnvCfg(DirectRLEnvCfg):
23    # env
24    episode_length_s = 15.0
25    decimation = 2
26    action_scale = 1.0
27    action_space = 21
28    observation_space = 75
29    state_space = 0
30
31    # simulation
32    sim: SimulationCfg = SimulationCfg(dt=1 / 120, render_interval=decimation)
33    terrain = TerrainImporterCfg(
34        prim_path="/World/ground",
35        terrain_type="plane",
36        collision_group=-1,
37        physics_material=sim_utils.RigidBodyMaterialCfg(
38            friction_combine_mode="average",
39            restitution_combine_mode="average",
40            static_friction=1.0,
41            dynamic_friction=1.0,
42            restitution=0.0,
43        ),
44        debug_vis=False,
45    )
46
47    # scene
48    scene: InteractiveSceneCfg = InteractiveSceneCfg(num_envs=4096, env_spacing=4.0, replicate_physics=True)
49
50    # robot
51    robot: ArticulationCfg = HUMANOID_CFG.replace(prim_path="/World/envs/env_.*/Robot")
52    joint_gears: list = [
53        67.5000,  # lower_waist
54        67.5000,  # lower_waist
55        67.5000,  # right_upper_arm
56        67.5000,  # right_upper_arm
57        67.5000,  # left_upper_arm
58        67.5000,  # left_upper_arm
59        67.5000,  # pelvis
60        45.0000,  # right_lower_arm
61        45.0000,  # left_lower_arm
62        45.0000,  # right_thigh: x
63        135.0000,  # right_thigh: y
64        45.0000,  # right_thigh: z
65        45.0000,  # left_thigh: x
66        135.0000,  # left_thigh: y
67        45.0000,  # left_thigh: z
68        90.0000,  # right_knee
69        90.0000,  # left_knee
70        22.5,  # right_foot
71        22.5,  # right_foot
72        22.5,  # left_foot
73        22.5,  # left_foot
74    ]
75
76    heading_weight: float = 0.5
77    up_weight: float = 0.1
78
79    energy_cost_scale: float = 0.05
80    actions_cost_scale: float = 0.01
81    alive_reward_scale: float = 2.0
82    dof_vel_scale: float = 0.1
83
84    death_cost: float = -1.0
85    termination_height: float = 0.8
86
87    angular_velocity_scale: float = 0.25
88    contact_force_scale: float = 0.01
89
90
91class HumanoidEnv(LocomotionEnv):
92    cfg: HumanoidEnvCfg
93
94    def __init__(self, cfg: HumanoidEnvCfg, render_mode: str | None = None, **kwargs):
95        super().__init__(cfg, render_mode, **kwargs)

The Changes Explained#

Duplicating the file and registering a new task#

To avoid modifying the code of the existing task, we will make a copy of the file containing the Python code and perform the modification on this copy. Then, in the Isaac Lab project source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/humanoid folder we make a copy of the humanoid_env.py file and rename it to h1_env.py.

Open the h1_env.py file in a code editor and replace all the humanoid task name (HumanoidEnv) and its configuration (HumanoidEnvCfg) instances to H1Env and H1EnvCfg respectively. This is necessary to avoid name conflicts during import when registering the environment.

Once the name change has been made, we proceed to add a new entry to register the task under the name Isaac-H1-Direct-v0. To do this, we modify the __init__.py file in the same working folder and add the following entry. Refer to the Registering an Environment tutorial for more details about environment registrations.

Hint

If the changes in the task are minimal, it is very likely that the same RL library agent configurations can be used to train it successfully. Otherwise, it is advisable to create new configuration files (adjusting their name during registration under the kwargs parameter) to avoid altering the original configurations.

from .h1_env import H1Env, H1EnvCfg
gym.register(
    id="Isaac-H1-Direct-v0",
    entry_point="omni.isaac.lab_tasks.direct.humanoid:H1Env",
    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": H1EnvCfg,
        "rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
        "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:HumanoidPPORunnerCfg",
        "skrl_cfg_entry_point": f"{agents.__name__}:skrl_ppo_cfg.yaml",
    },
)

Changing the robot#

The H1EnvCfg class (in the new created h1_env.py file) encapsulates the configuration values of the environment, including the assets to be instantiated. Particularly in this example, the robot property holds the target articulation configuration.

Since the Unitree H1 robot is included in the Isaac Lab assets extension (omni.isaac.lab_assets) we can just import it and do the replacement directly (under the H1EnvCfg.robot property), as shown below. Note that we also need to modify the joint_gears property as it holds robot-specific configuration values.

Hint

If the target robot is not included in the Isaac Lab assets extension, it is possible to load and configure it, from a USD file, by using the ArticulationCfg class.

from omni.isaac.lab_assets import H1_CFG
robot: ArticulationCfg = H1_CFG.replace(prim_path="/World/envs/env_.*/Robot")
joint_gears: list = [
    50.0,  # left_hip_yaw
    50.0,  # right_hip_yaw
    50.0,  # torso
    50.0,  # left_hip_roll
    50.0,  # right_hip_roll
    50.0,  # left_shoulder_pitch
    50.0,  # right_shoulder_pitch
    50.0,  # left_hip_pitch
    50.0,  # right_hip_pitch
    50.0,  # left_shoulder_roll
    50.0,  # right_shoulder_roll
    50.0,  # left_knee
    50.0,  # right_knee
    50.0,  # left_shoulder_yaw
    50.0,  # right_shoulder_yaw
    50.0,  # left_ankle
    50.0,  # right_ankle
    50.0,  # left_elbow
    50.0,  # right_elbow
]

The robot changed, and with it the number of joints to control or the number of rigid bodies that compose the articulation, for example. Therefore, it is also necessary to adjust other values in the environment configuration that depend on the characteristics of the robot, such as the number of elements in the observation and action space.

action_space = 19
observation_space = 69

The Code Execution#

After the minor modification has been done, and similar to the previous tutorial, we can train on the task using one of the available RL workflows for such task.

./isaaclab.sh -p source/standalone/workflows/rl_games/train.py --task Isaac-H1-Direct-v0 --headless

When the training is finished, we can visualize the result with the following command. To stop the simulation, you can either close the window, or press Ctrl+C in the terminal where you started the simulation.

./isaaclab.sh -p source/standalone/workflows/rl_games/play.py --task Isaac-H1-Direct-v0 --num_envs 64
result of training Isaac-H1-Direct-v0 task

In this tutorial, we learnt how to make minor modifications to an existing environment without affecting the original code.

It is important to note, however, that while the changes to be made may be small, they may not always work on the first try, as there may be deeper dependencies on the original assets in the environment being modified. In these cases, it is advisable to analyze the code of the available examples in detail in order to make an appropriate adjustment.