Warp Environment Migration Guide#
This guide covers the key conventions and patterns used by the warp-first environment infrastructure, useful for migrating existing torch environments or creating new ones natively. For an overview of the warp env path itself (workflows, available envs, performance, limitations, benchmarking), see Warp Experimental Environments.
Design Rationale#
The warp environment path is built around CUDA graph capture.
A CUDA graph records a sequence of GPU operations (kernel launches, memory copies) during a
capture phase, then replays the entire sequence with a single launch. This eliminates per-kernel
CPU overhead: the parameter validation, kernel selection, and buffer setup that normally cost
20–200 μs per operation are performed once during graph instantiation and reused on every replay
(~10 μs total). CPU-side code executed during capture (Python logic, torch dispatching) is
not executed again during replay. See the Warp concurrency documentation for Warp's graph capture API
(wp.ScopedCapture).
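To make the overhead figures concrete, the following back-of-the-envelope calculation compares CPU-side launch overhead per step with and without graph replay. The kernel count of 50 per step is a hypothetical figure chosen for illustration; the per-operation and replay costs are the values quoted above.

```python
# Rough estimate of CPU-side launch overhead per environment step.
KERNELS_PER_STEP = 50           # hypothetical; depends on the env's MDP terms
PER_OP_COST_US = (20.0, 200.0)  # per-operation CPU cost range quoted above
REPLAY_COST_US = 10.0           # approximate cost of one graph replay


def eager_overhead_us(num_kernels, per_op_cost_us):
    """CPU overhead when every kernel is launched individually."""
    return num_kernels * per_op_cost_us


low = eager_overhead_us(KERNELS_PER_STEP, PER_OP_COST_US[0])
high = eager_overhead_us(KERNELS_PER_STEP, PER_OP_COST_US[1])

print(f"eager:  {low:.0f} to {high:.0f} us per step")
print(f"replay: ~{REPLAY_COST_US:.0f} us per step")
print(f"ratio:  {low / REPLAY_COST_US:.0f}x to {high / REPLAY_COST_US:.0f}x")
```

Under these assumptions the eager path spends roughly 1,000 to 10,000 μs of CPU time per step on launch overhead, against about 10 μs for a single replay.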
All design decisions in the warp infrastructure follow from this constraint: every operation in the step loop must be a GPU kernel launch with stable memory pointers so that the captured graph can be replayed without modification.
Key consequences:
- All buffers are pre-allocated: no dynamic allocation inside the step loop
- Data flows through persistent ``wp.array`` pointers: never replaced, only overwritten
- MDP terms are pure ``@wp.kernel`` functions: no Python branching on GPU data
- Reset uses boolean masks (env_mask) instead of index lists (env_ids) to avoid variable-length indexing that changes graph topology
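The mask-based reset convention can be illustrated without any GPU code. The sketch below uses plain Python lists in place of wp.array buffers, and write_mask is a hypothetical helper, not part of the Isaac Lab API. It shows that the mask keeps the same length and the same underlying object regardless of how many environments are selected, which is the property a captured graph relies on.

```python
def write_mask(mask, env_ids):
    """Overwrite a pre-allocated boolean mask in place from an index list.

    Hypothetical helper for illustration only.
    """
    for i in range(len(mask)):
        mask[i] = False
    for i in env_ids:
        mask[i] = True


num_envs = 6
env_mask = [False] * num_envs   # allocated once, reused for every reset
buffer_id = id(env_mask)

write_mask(env_mask, [1, 4])            # two envs selected
first = list(env_mask)

write_mask(env_mask, [0, 2, 3, 5])      # four envs selected
second = list(env_mask)

print(first)
print(second)
print(len(first) == len(second) == num_envs)  # size never depends on the selection
print(id(env_mask) == buffer_id)              # same buffer, only overwritten
```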
Project Structure#
Warp-specific implementations that diverge from the torch-based managers and env classes live in the _experimental packages:
- isaaclab_experimental: warp managers, base env classes, warp MDP terms
- isaaclab_tasks_experimental: warp task configs and task-specific MDP terms
Any new warp implementation that differs from the torch-based managers or env classes belongs in these packages.
Warp task configs reference Newton physics directly (no PresetCfg) since the warp path
is Newton-only.
Writing Warp MDP Terms#
Imports#
Warp task configs import from the experimental packages:
# Warp
from isaaclab_experimental.managers import ObservationTermCfg, RewardTermCfg, SceneEntityCfg
import isaaclab_experimental.envs.mdp as mdp
The term config classes have the same interface — only the import path changes.
Common Pattern#
All warp MDP terms (observations, rewards, terminations, events, actions) follow the same
kernel + launch pattern. Torch terms use torch tensors and return results; warp terms
write into pre-allocated wp.array output buffers via @wp.kernel functions:
# Torch: returns a tensor
def lin_vel_z_l2(env, asset_cfg) -> torch.Tensor:
    asset = env.scene[asset_cfg.name]
    return torch.square(asset.data.root_lin_vel_b[:, 2])

# Warp: writes into pre-allocated output
@wp.kernel
def _lin_vel_z_l2_kernel(vel: wp.array(...), out: wp.array(dtype=wp.float32)):
    i = wp.tid()  # one thread per environment
    out[i] = vel[i][2] * vel[i][2]

def lin_vel_z_l2(env, out, asset_cfg) -> None:
    wp.launch(_lin_vel_z_l2_kernel, dim=env.num_envs, inputs=[..., out])
The output buffer shapes differ by term type:
- Observations: (num_envs, D), where D is the observation dimension
- Rewards: (num_envs,)
- Terminations: (num_envs,) with dtype bool
- Events: (num_envs,) mask (events don't produce output; they modify sim state)
Observation Terms#
Since warp terms write into pre-allocated buffers, the observation manager must know each
term’s output dimension at initialization to allocate the correct (num_envs, D) output
array. This is resolved via a fallback chain (see
ObservationManager._infer_term_dim_scalar in
isaaclab_experimental/managers/observation_manager.py):
1. Explicit ``out_dim`` in decorator (preferred):

    @generic_io_descriptor_warp(out_dim=3, observation_type="RootState")
    def base_lin_vel(env, out, asset_cfg) -> None: ...

   out_dim can be an integer, or a string that resolves at initialization:
   - "joint": number of selected joints from asset_cfg
   - "body:N": N components per selected body from asset_cfg
   - "command": dimension from command manager
   - "action": dimension from action manager
2. ``axes`` metadata: dimension equals the number of axes listed:

    @generic_io_descriptor_warp(axes=["X", "Y", "Z"], observation_type="RootState")
    def projected_gravity(env, out, asset_cfg) -> None: ...  # → dimension = 3

3. Legacy params: term_dim, out_dim, or obs_dim keys in term_cfg.params.
4. Asset config fallback: count of asset_cfg.joint_ids (or joint_ids_wp) for joint-level terms.
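The order of the fallback chain can be sketched in plain Python. This is not the code of ObservationManager._infer_term_dim_scalar; the function names, the dictionary-based inputs, and the interpretation of "body:N" as N times the number of selected bodies are assumptions made for illustration.

```python
def resolve_out_dim(out_dim, ctx):
    """Resolve an integer or string out_dim (hypothetical helper)."""
    if isinstance(out_dim, int):
        return out_dim
    if out_dim == "joint":
        return ctx["num_joints"]
    if out_dim.startswith("body:"):
        # Assumed meaning: N components for each selected body.
        return int(out_dim.split(":", 1)[1]) * ctx["num_bodies"]
    if out_dim == "command":
        return ctx["command_dim"]
    if out_dim == "action":
        return ctx["action_dim"]
    raise ValueError(f"unknown out_dim: {out_dim!r}")


def infer_term_dim(decorator_meta, params, ctx):
    """Walk the fallback chain in the order described above."""
    # 1. Explicit out_dim in the decorator.
    if decorator_meta.get("out_dim") is not None:
        return resolve_out_dim(decorator_meta["out_dim"], ctx)
    # 2. axes metadata: one component per listed axis.
    if decorator_meta.get("axes"):
        return len(decorator_meta["axes"])
    # 3. Legacy keys in the term's params.
    for key in ("term_dim", "out_dim", "obs_dim"):
        if key in params:
            return int(params[key])
    # 4. Asset config fallback for joint-level terms.
    if ctx.get("num_joints") is not None:
        return ctx["num_joints"]
    raise ValueError("cannot infer observation term dimension")


# Hypothetical sizes for a 12-joint robot with 2 selected bodies.
ctx = {"num_joints": 12, "num_bodies": 2, "command_dim": 3, "action_dim": 12}

print(infer_term_dim({"out_dim": 3}, {}, ctx))
print(infer_term_dim({"out_dim": "body:3"}, {}, ctx))
print(infer_term_dim({"axes": ["X", "Y", "Z"]}, {}, ctx))
print(infer_term_dim({}, {"obs_dim": 5}, ctx))
print(infer_term_dim({}, {}, ctx))
```

Declaring out_dim explicitly keeps the term on the first branch, so its dimension does not depend on what the later fallbacks happen to find.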
Event Terms#
Events use env_mask (boolean wp.array) instead of env_ids, and each kernel
checks the mask to skip non-selected environments:
def reset_joints_by_offset(env, env_mask, ...):
    wp.launch(_kernel, dim=env.num_envs, inputs=[env_mask, ...])

@wp.kernel
def _kernel(env_mask: wp.array(dtype=wp.bool), ...):
    i = wp.tid()
    if not env_mask[i]:
        return
    # ... modify state for selected envs only
- RNG uses per-env env.rng_state_wp (wp.uint32) instead of torch.rand
- Startup/prestartup events use the torch convention (env, env_ids, **params)
- Reset/interval events use the warp convention (env, env_mask, **params)
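The mask check can be emulated thread by thread in plain Python to see which environments are touched. In this sketch, launch is a pure-Python stand-in for wp.launch, the kernel body is a hypothetical simplification of a joint reset, and a fixed offset replaces the per-env random sample that a real term would draw from env.rng_state_wp.

```python
def reset_kernel(i, env_mask, default_pos, offset, joint_pos):
    """Body executed by 'thread' i; mirrors the early-return mask check."""
    if not env_mask[i]:
        return
    for j in range(len(joint_pos[i])):
        joint_pos[i][j] = default_pos[i][j] + offset


def launch(kernel, dim, inputs):
    """Stand-in for wp.launch: run the kernel body once per thread index."""
    for i in range(dim):
        kernel(i, *inputs)


num_envs, num_joints = 4, 2
default_pos = [[0.0] * num_joints for _ in range(num_envs)]
joint_pos = [[1.0] * num_joints for _ in range(num_envs)]  # current sim state
env_mask = [True, False, False, True]                      # reset envs 0 and 3

# The launch dimension is always num_envs; the mask decides who does work.
launch(reset_kernel, num_envs, [env_mask, default_pos, 0.25, joint_pos])

for row in joint_pos:
    print(row)
```

Only the rows for environments 0 and 3 are overwritten; the other rows keep their previous values even though a thread was launched for them.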
Action Terms#
Actions follow a two-stage execution: process_actions (called once per env step) scales
and clips raw actions, and apply_actions (called once per sim step) writes targets to the
asset. Both stages use warp kernels with pre-allocated _raw_actions and _processed_actions
buffers.
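The two-stage flow can be sketched in plain Python. ToyJointPositionAction is a hypothetical stand-in, not the JointPositionAction class from the experimental package; nested lists replace the wp.array buffers, and the scale, clip range, and decimation values are made up for illustration.

```python
class ToyJointPositionAction:
    """Pure-Python stand-in for a two-stage action term (hypothetical)."""

    def __init__(self, num_envs, action_dim, scale, clip):
        self._scale = scale
        self._clip = clip
        # All buffers are allocated once here and only overwritten later.
        self._raw_actions = [[0.0] * action_dim for _ in range(num_envs)]
        self._processed_actions = [[0.0] * action_dim for _ in range(num_envs)]
        self.joint_targets = [[0.0] * action_dim for _ in range(num_envs)]
        self.apply_calls = 0

    def process_actions(self, actions):
        """Once per env step: store raw actions, then scale and clip."""
        lo, hi = self._clip
        for i, row in enumerate(actions):
            for j, a in enumerate(row):
                self._raw_actions[i][j] = a
                self._processed_actions[i][j] = min(max(a * self._scale, lo), hi)

    def apply_actions(self):
        """Once per sim step: write processed actions to the asset targets."""
        for i, row in enumerate(self._processed_actions):
            for j, value in enumerate(row):
                self.joint_targets[i][j] = value
        self.apply_calls += 1


term = ToyJointPositionAction(num_envs=2, action_dim=2, scale=0.5, clip=(-1.0, 1.0))
processed_buffer = term._processed_actions  # keep a handle to check identity

decimation = 4  # hypothetical number of sim steps per env step
term.process_actions([[1.0, -4.0], [0.5, 3.0]])
for _ in range(decimation):
    term.apply_actions()

print(term._processed_actions)
print(term.apply_calls)
print(term._processed_actions is processed_buffer)
```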
Capture Safety#
When writing terms that run inside the captured step loop, keep in mind:
- No ``wp.to_torch`` or torch arithmetic: stay in warp throughout
- No lazy-evaluated properties: use sim-bound (Tier 1) data directly; if a derived quantity is needed, compute it inline in the kernel
- No dynamic allocation: all buffers must be pre-allocated in __init__
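Why Python-side derived quantities go stale can be seen in a toy model of capture and replay. ToyGraph below is not the Warp API; it is a hypothetical pure-Python recorder that stores launches with the buffer objects they received and re-runs only those launches on replay, which is the behavior the rules above guard against.

```python
class ToyGraph:
    """Toy stand-in for a captured graph (not the Warp API).

    During capture, launches are only recorded together with the buffer
    objects they were given. Replay re-runs exactly those launches.
    """

    def __init__(self):
        self._launches = []

    def launch(self, kernel, *buffers):
        self._launches.append((kernel, buffers))

    def replay(self):
        for kernel, buffers in self._launches:
            kernel(*buffers)


def copy_kernel(src, dst):
    for i in range(len(src)):
        dst[i] = src[i]


def double_kernel(src, dst):
    for i in range(len(src)):
        dst[i] = 2.0 * src[i]


pos = [1.0, 2.0]        # persistent, sim-bound buffer (overwritten in place)
out_stale = [0.0, 0.0]
out_live = [0.0, 0.0]

graph = ToyGraph()

# Capture-unsafe: the derived quantity is computed by Python code, which
# runs once at capture time and never again.
doubled_once = [2.0 * p for p in pos]
graph.launch(copy_kernel, doubled_once, out_stale)

# Capture-safe: the derived quantity is computed inside the kernel from the
# persistent buffer.
graph.launch(double_kernel, pos, out_live)

pos[0], pos[1] = 10.0, 20.0   # sim state changes in place after capture
graph.replay()

print(out_stale)  # still reflects the values seen at capture time
print(out_live)   # reflects the current contents of pos
```

The unsafe path keeps replaying a snapshot taken at capture time, while the in-kernel computation follows the live buffer.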
Parity Testing#
Two levels of parity testing are used to validate warp terms:
1. Implementation parity (torch vs warp) — verifies that the warp kernel produces the same result as the torch implementation. This is optional for terms that have no torch counterpart (e.g. new terms written directly in warp).
import torch
import warp as wp

import isaaclab.envs.mdp.observations as torch_obs
import isaaclab_experimental.envs.mdp.observations as warp_obs
# Torch baseline
expected = torch_obs.joint_pos(torch_env, asset_cfg=cfg)
# Warp (uncaptured)
out = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device)
warp_obs.joint_pos(warp_env, out, asset_cfg=cfg)
actual = wp.to_torch(out)
torch.testing.assert_close(actual, expected)
2. Capture parity (warp vs warp-captured) — verifies that the term produces identical results when replayed from a CUDA graph vs launched directly. A mismatch here indicates capture-unsafe code (e.g. stale pointers, dynamic allocation, or lazy property access that doesn’t replay). This test should always be run, even for terms without a torch counterpart.
# Warp uncaptured
out_uncaptured = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device)
warp_obs.joint_pos(warp_env, out_uncaptured, asset_cfg=cfg)
# Warp captured (graph replay)
out_captured = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device)
with wp.ScopedCapture() as cap:
    warp_obs.joint_pos(warp_env, out_captured, asset_cfg=cfg)
wp.capture_launch(cap.graph)
torch.testing.assert_close(wp.to_torch(out_captured), wp.to_torch(out_uncaptured))
See source/isaaclab_experimental/test/envs/mdp/ for complete parity test examples.
Available Warp MDP Terms#
| Category | Available Terms |
|---|---|
| Observations (11) | base_pos_z, base_lin_vel, base_ang_vel, projected_gravity, joint_pos, joint_pos_rel, joint_pos_limit_normalized, joint_vel, joint_vel_rel, last_action, generated_commands |
| Rewards (16) | is_alive, is_terminated, lin_vel_z_l2, ang_vel_xy_l2, flat_orientation_l2, joint_torques_l2, joint_vel_l1, joint_vel_l2, joint_acc_l2, joint_deviation_l1, joint_pos_limits, action_rate_l2, action_l2, undesired_contacts, track_lin_vel_xy_exp, track_ang_vel_z_exp |
| Events (6) | reset_joints_by_offset, reset_joints_by_scale, reset_root_state_uniform, push_by_setting_velocity, apply_external_force_torque, randomize_rigid_body_com |
| Terminations (4) | time_out, root_height_below_minimum, joint_pos_out_of_manual_limit, illegal_contact |
| Actions (2) | JointPositionAction, JointEffortAction |
Terms not listed here remain in torch only. When using an env that requires unlisted terms, those terms must be implemented in warp first.