Environment Setup and Validation#

Docker Container: Base (see Docker Containers for more details)

On this page we briefly describe the RL environment used in this example workflow and validate that it loads correctly in Isaac Lab.

To launch the container, run:

./docker/run_docker.sh

Environment Description#

The Lift Object RL Environment
import argparse

# Note: ExampleEnvironmentBase is assumed to come from the examples framework
# and to provide self.asset_registry, which is used below to look up assets.
class LiftObjectEnvironment(ExampleEnvironmentBase):

    name: str = "lift_object"

    def get_env(self, args_cli: argparse.Namespace):
        from isaaclab_arena.environments.isaaclab_arena_environment import IsaacLabArenaEnvironment
        from isaaclab_arena.scene.scene import Scene
        from isaaclab_arena.tasks.lift_object_task import LiftObjectTaskRL
        from isaaclab_arena.utils.pose import Pose

        background = self.asset_registry.get_asset_by_name("table")()
        pick_up_object = self.asset_registry.get_asset_by_name(args_cli.object)()

        # Add ground plane and light to the scene
        ground_plane = self.asset_registry.get_asset_by_name("ground_plane")()
        light = self.asset_registry.get_asset_by_name("light")()

        assets = [background, pick_up_object, ground_plane, light]

        embodiment = self.asset_registry.get_asset_by_name(args_cli.embodiment)(
            concatenate_observation_terms=True
        )

        # Set all positions
        background.set_initial_pose(Pose(position_xyz=(0.5, 0, 0), rotation_wxyz=(0.707, 0, 0, 0.707)))
        pick_up_object.set_initial_pose(Pose(position_xyz=(0.5, 0, 0.055), rotation_wxyz=(1, 0, 0, 0)))
        ground_plane.set_initial_pose(Pose(position_xyz=(0.0, 0.0, -1.05)))

        # Compose the scene
        scene = Scene(assets=assets)

        task = LiftObjectTaskRL(
            pick_up_object,
            background,
            embodiment,
            minimum_height_to_lift=0.04,
            episode_length_s=5.0,
            rl_training_mode=args_cli.rl_training_mode,
        )

        isaaclab_arena_environment = IsaacLabArenaEnvironment(
            name=self.name,
            embodiment=embodiment,
            scene=scene,
            task=task,
            teleop_device=None,  # no teleoperation device is needed for RL training
        )

        return isaaclab_arena_environment

Step-by-Step Breakdown#

1. Interact with the Asset Registry

background = self.asset_registry.get_asset_by_name("table")()
pick_up_object = self.asset_registry.get_asset_by_name(args_cli.object)()
ground_plane = self.asset_registry.get_asset_by_name("ground_plane")()
light = self.asset_registry.get_asset_by_name("light")()

embodiment = self.asset_registry.get_asset_by_name(args_cli.embodiment)(
    concatenate_observation_terms=True
)

Here, we select the components needed for our RL task: a table as the support surface, an object to lift (configurable via the CLI; the default is dex_cube), a ground plane for physics, and a light for visualization. Note that get_asset_by_name returns an asset class rather than an instance; the trailing call operator then instantiates it, optionally with configuration arguments. The Franka embodiment is configured with concatenate_observation_terms=True so that it produces a single flat observation vector suitable for RL training.
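
As a mental model for the lookup-then-instantiate pattern above, here is a minimal sketch of a name-to-class registry. It is an illustration only; the actual AssetRegistry implementation in isaaclab_arena may differ:

class AssetRegistry:
    """Hypothetical sketch: maps asset names to asset classes."""

    def __init__(self) -> None:
        self._assets: dict[str, type] = {}

    def register(self, name: str, asset_cls: type) -> None:
        self._assets[name] = asset_cls

    def get_asset_by_name(self, name: str) -> type:
        # Returns the asset class itself; the caller instantiates it,
        # optionally passing configuration kwargs such as
        # concatenate_observation_terms=True.
        return self._assets[name]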

2. Position the Objects

background.set_initial_pose(Pose(position_xyz=(0.5, 0, 0), rotation_wxyz=(0.707, 0, 0, 0.707)))
pick_up_object.set_initial_pose(Pose(position_xyz=(0.5, 0, 0.055), rotation_wxyz=(1, 0, 0, 0)))
ground_plane.set_initial_pose(Pose(position_xyz=(0.0, 0.0, -1.05)))

Before we create the scene, we need to place the assets. The table is positioned at (0.5, 0, 0) and rotated 90° about the z-axis, the object rests on the table top at a height of 0.055 m, and the ground plane is positioned 1.05 m below the origin to provide physical support.
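
Pose rotations are specified as wxyz quaternions. As a quick sanity check on the table's rotation_wxyz=(0.707, 0, 0, 0.707), the snippet below derives that quaternion from the axis-angle formula for a 90° rotation about the z-axis:

import math

# A rotation of angle theta about the z-axis corresponds to the wxyz
# quaternion (cos(theta / 2), 0, 0, sin(theta / 2)).
theta = math.pi / 2  # 90 degrees
w, z = math.cos(theta / 2), math.sin(theta / 2)
print((round(w, 3), 0.0, 0.0, round(z, 3)))  # (0.707, 0.0, 0.0, 0.707)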

3. Compose the Scene

scene = Scene(assets=assets)

Now we bring everything together into an IsaacLab-Arena scene. See Scene Design for scene composition details.

4. Create the Lift Object RL Task

task = LiftObjectTaskRL(
    pick_up_object,
    background,
    embodiment,
    minimum_height_to_lift=0.04,
    episode_length_s=5.0,
    rl_training_mode=args_cli.rl_training_mode,
)

The LiftObjectTaskRL encapsulates the RL training objective: lift the object to commanded target positions. The task includes:

  • Command Manager: Samples random target positions within a configurable range

  • Reward Terms: Dense rewards for reaching, grasping, lifting, and achieving target poses (a minimal sketch of one such term follows this list)

  • Observation Space: Robot state (joint positions, velocities), object state (pose, velocity), and goal commands

  • Termination Conditions: Object dropped or timeout

  • Success Condition: Object reaches target position (only active during evaluation when rl_training_mode=False)
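
To make the reward-term idea concrete, here is a minimal sketch of a dense lifting term written in the style of Isaac Lab's manager-based reward functions. The scene key "pick_up_object" and the attribute path data.root_pos_w follow Isaac Lab conventions, but the actual names used inside LiftObjectTaskRL are assumptions:

import torch

def object_is_lifted(env, minimum_height: float = 0.04) -> torch.Tensor:
    """Return 1.0 for each parallel environment whose object is above the threshold."""
    # World-frame z position of the object's root body, shape (num_envs,).
    object_height = env.scene["pick_up_object"].data.root_pos_w[:, 2]
    return (object_height > minimum_height).float()

In the manager-based workflow, each such term is evaluated every step and combined into the total reward via a per-term weight.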

See Tasks Design for task creation details.

5. Create the IsaacLab Arena Environment

isaaclab_arena_environment = IsaacLabArenaEnvironment(
    name=self.name,
    embodiment=embodiment,
    scene=scene,
    task=task,
    teleop_device=None,
)

Finally, we assemble all the pieces into a complete, runnable RL environment. The IsaacLabArenaEnvironment connects the embodiment (the robot), the scene (the world), and the task (the objective and rewards). See Environment Design for environment composition details.
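
To see the whole pipeline end to end, the sketch below constructs the environment directly. The flag names mirror the args_cli fields read by get_env, while the default values ("dex_cube", "franka") and the driver script itself are hypothetical:

import argparse

from lift_object_environment import LiftObjectEnvironment  # hypothetical module path

parser = argparse.ArgumentParser()
parser.add_argument("--object", default="dex_cube")  # asset name in the registry
parser.add_argument("--embodiment", default="franka")  # registry name assumed here
parser.add_argument("--rl_training_mode", action="store_true")
args_cli = parser.parse_args()

env = LiftObjectEnvironment().get_env(args_cli)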

Validation: Run Random Policy#

To validate the environment setup, we run one training iteration with a randomly initialized policy to confirm that everything loads correctly:

python isaaclab_arena/scripts/reinforcement_learning/train.py \
  --num_envs 64 \
  --max_iterations 1 \
  lift_object

This command loads the environment, spawns 64 parallel instances, runs a single training iteration (max_iterations=1), and exits. If it completes without errors, the environment is ready for training.

You should see output indicating the start of training:

Learning iteration 0/1

                          Computation: 308 steps/s (collection: 4.600s, learning 0.377s)
                Mean action noise std: 1.00
             Mean value_function loss: 0.0273
                  Mean surrogate loss: -0.0138
                    Mean entropy loss: 9.9339
                          Mean reward: 0.65
                  Mean episode length: 12.00
           Episode_Reward/action_rate: -0.0000
             Episode_Reward/joint_vel: -0.0001
       Episode_Reward/reaching_object: 0.0000
        Episode_Reward/lifting_object: 0.1050
  Episode_Reward/object_goal_tracking: 0.0223
Episode_Reward/object_goal_tracking_fine_grained: 0.0000
   Metrics/object_pose/position_error: 0.5721
Metrics/object_pose/orientation_error: 2.2834
         Episode_Termination/time_out: 0.0423
   Episode_Termination/object_dropped: 0.0000
          Episode_Termination/success: 0.0000
================================================================================
                      Total timesteps: 1536
                       Iteration time: 4.98s
                         Time elapsed: 00:00:04
                                  ETA: 00:00:49
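
Once the validation run succeeds, a full training run uses the same entry point with larger values for the same flags; the counts below are illustrative placeholders rather than tuned settings:

python isaaclab_arena/scripts/reinforcement_learning/train.py \
  --num_envs 4096 \
  --max_iterations 1000 \
  lift_object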