Find How Many/What Cameras You Should Train With

Currently in Isaac Lab, there are several camera types: USD cameras (standard), tiled cameras, and ray caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py script can be used to understand the differences between camera types, as well as to characterize their relative performance at different parameters, such as camera quantity, image dimensions, and data types.

This utility is provided so that one can easily find the camera type and parameters that are most performant while meeting the requirements of the user's scenario. This utility also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
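For example, to compare tiled cameras against standard cameras at the same resolution, one might run the script once per camera type and compare the reported average step durations. The flags below all exist in the script; the camera count and resolution values are illustrative. Note that only one camera type should be non-zero per run, since the script refuses to benchmark multiple camera types at once.

```bash
# Benchmark 100 tiled cameras at 120x140, headless
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py --headless \
    --num_tiled_cameras 100 --height 120 --width 140

# Benchmark 100 standard cameras at the same resolution for comparison
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py --headless \
    --num_standard_cameras 100 --height 120 --width 140
```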

This utility can inject cameras into an existing task from the gym registry, which can be useful for benchmarking cameras in a specific scenario. Also, if you install pynvml, you can let this utility automatically find the maximum number of cameras that can run in your task environment, up to a specified system resource utilization threshold (without training; taking zero actions at each timestep).
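The autotune stopping rule can be sketched as follows. This is an illustrative reimplementation of the threshold check, not the script's exact code; the four utilization values correspond, in order, to the defaults of `--autotune_max_percentage_util` (CPU, RAM, GPU compute, GPU memory):

```python
def exceeds_thresholds(utilization: list[float], thresholds: list[float]) -> bool:
    """Return True if any measured utilization percentage meets or exceeds its threshold."""
    return any(u >= t for u, t in zip(utilization, thresholds))


# Default thresholds: CPU 100%, RAM 80%, GPU compute 80%, GPU memory 80%
thresholds = [100.0, 80.0, 80.0, 80.0]

print(exceeds_thresholds([45.0, 60.0, 70.0, 50.0], thresholds))  # False: keep adding cameras
print(exceeds_thresholds([45.0, 60.0, 85.0, 50.0], thresholds))  # True: stop the autotune
```

The autotune then increases the camera count by `--autotune_camera_count_interval` on each iteration until this check trips or `--autotune_max_camera_count` is reached.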

This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks directory.

Code for benchmark_cameras.py
  1# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
  2# All rights reserved.
  3#
  4# SPDX-License-Identifier: BSD-3-Clause
  5
  6"""
  7This script might help you determine how many cameras your system can realistically run
  8at different desired settings.
  9
 10You can supply different task environments to inject cameras into, or just test a sample scene.
 11Additionally, you can automatically find the maximum amount of cameras you can run a task with
 12through the auto-tune functionality.
 13
 14.. code-block:: bash
 15
 16    # Usage with GUI
 17    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
 18
 19    # Usage with headless
 20    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless
 21
 22"""
 23
 24"""Launch Isaac Sim Simulator first."""
 25
 26import argparse
 27from collections.abc import Callable
 28
 29from isaaclab.app import AppLauncher
 30
 31# pre-declare the CLI arguments namespace; it is populated by parse_args() below
 32args_cli = argparse.Namespace()
 33
 34parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")
 35
 36"""
 37The following arguments only need to be supplied for when one wishes
 38to try injecting cameras into their environment, and automatically determining
 39the maximum camera count.
 40"""
 41parser.add_argument(
 42    "--task",
 43    type=str,
 44    default=None,
 45    required=False,
 46    help="Supply this argument to spawn cameras within a known manager-based task environment.",
 47)
 48
 49parser.add_argument(
 50    "--autotune",
 51    default=False,
 52    action="store_true",
 53    help=(
 54        "Autotuning is only supported for provided task environments."
 55        " Supply this argument to increase the number of environments until a desired threshold is reached."
 56        " Install pynvml in your environment: ./isaaclab.sh -m pip install pynvml"
 57    ),
 58)
 59
 60parser.add_argument(
 61    "--task_num_cameras_per_env",
 62    type=int,
 63    default=1,
 64    help="The number of cameras per environment to use when using a known task.",
 65)
 66
 67parser.add_argument(
 68    "--use_fabric", action="store_true", default=False, help="Enable fabric and use USD I/O operations."
 69)
 70
 71parser.add_argument(
 72    "--autotune_max_percentage_util",
 73    nargs="+",
 74    type=float,
 75    default=[100.0, 80.0, 80.0, 80.0],
 76    required=False,
 77    help=(
 78        "The system utilization percentage thresholds to reach before an autotune is finished. "
 79        "If any one of these limits is hit, the autotune stops. "
 80        "Thresholds are, in order: maximum CPU percentage utilization, "
 81        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
 82        "and maximum GPU memory utilization."
 83    ),
 84)
 85
 86parser.add_argument(
 87    "--autotune_max_camera_count", type=int, default=4096, help="The maximum amount of cameras allowed in an autotune."
 88)
 89
 90parser.add_argument(
 91    "--autotune_camera_count_interval",
 92    type=int,
 93    default=25,
 94    help=(
 95        "The number of cameras to try to add to the environment if the current camera count"
 96        " falls within permitted system resource utilization limits."
 97    ),
 98)
 99
100"""
101The following arguments are shared for when injecting cameras into a task environment,
102as well as when creating cameras independent of a task environment.
103"""
104
105parser.add_argument(
106    "--num_tiled_cameras",
107    type=int,
108    default=0,
109    required=False,
110    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
111)
112
113parser.add_argument(
114    "--num_standard_cameras",
115    type=int,
116    default=0,
117    required=False,
118    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
119)
120
121parser.add_argument(
122    "--num_ray_caster_cameras",
123    type=int,
124    default=0,
125    required=False,
126    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
127)
128
129parser.add_argument(
130    "--tiled_camera_data_types",
131    nargs="+",
132    type=str,
133    default=["rgb", "depth"],
134    help="The data types rendered by the tiled camera.",
135)
136
137parser.add_argument(
138    "--standard_camera_data_types",
139    nargs="+",
140    type=str,
141    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
142    help="The data types rendered by the standard camera.",
143)
144
145parser.add_argument(
146    "--ray_caster_camera_data_types",
147    nargs="+",
148    type=str,
149    default=["distance_to_image_plane"],
150    help="The data types rendered by the ray caster camera.",
151)
152
153parser.add_argument(
154    "--ray_caster_visible_mesh_prim_paths",
155    nargs="+",
156    type=str,
157    default=["/World/ground"],
158    help="WARNING: Ray Caster can currently only cast against a single, static object.",
159)
160
161parser.add_argument(
162    "--convert_depth_to_camera_to_image_plane",
163    action="store_true",
164    default=True,
165    help=(
166        "Enable undistorting from perspective view (distance_to_camera data type) "
167        "to orthogonal view (distance_to_image_plane data type) for depth. "
168        "This is currently needed to create undistorted depth images/point clouds."
169    ),
170)
171
172parser.add_argument(
173    "--keep_raw_depth",
174    dest="convert_depth_to_camera_to_image_plane",
175    action="store_false",
176    help=(
177        "Disable undistorting from perspective view (distance_to_camera) "
178        "to orthogonal view (distance_to_image_plane) for depth."
179    ),
180)
181
182parser.add_argument(
183    "--height",
184    type=int,
185    default=120,
186    required=False,
187    help="Height in pixels of cameras",
188)
189
190parser.add_argument(
191    "--width",
192    type=int,
193    default=140,
194    required=False,
195    help="Width in pixels of cameras",
196)
197
198parser.add_argument(
199    "--warm_start_length",
200    type=int,
201    default=3,
202    required=False,
203    help=(
204        "Number of steps to run the sim before starting the benchmark. "
205        "Needed to avoid blank images at the start of the simulation."
206    ),
207)
208
209parser.add_argument(
210    "--experiment_length",
211    type=int,
212    default=15,
213    required=False,
214    help="Number of steps to average over",
215)
216
217# This argument is only used when a task is not provided.
218parser.add_argument(
219    "--num_objects",
220    type=int,
221    default=10,
222    required=False,
223    help="Number of objects to spawn into the scene when not using a known task.",
224)
225
226# Benchmark arguments
227parser.add_argument(
228    "--benchmark_backend",
229    type=str,
230    default="omniperf",
231    choices=["json", "osmo", "omniperf", "summary"],
232    help="Benchmarking backend options; defaults to omniperf.",
233)
234parser.add_argument("--output_path", type=str, default=".", help="Path to output benchmark results.")
235
236
237AppLauncher.add_app_launcher_args(parser)
238args_cli = parser.parse_args()
239args_cli.enable_cameras = True
240
241if args_cli.autotune:
242    import pynvml
243
244if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
245    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
246# launch omniverse app
247app_launcher = AppLauncher(args_cli)
248simulation_app = app_launcher.app
249
250"""Rest everything follows."""
251
252import random
253import time
254
255import gymnasium as gym
256import numpy as np
257import psutil
258import torch
259
260import isaaclab.sim as sim_utils
261from isaaclab.assets import RigidObject, RigidObjectCfg
262from isaaclab.scene.interactive_scene import InteractiveScene
263from isaaclab.sensors import (
264    Camera,
265    CameraCfg,
266    RayCasterCamera,
267    RayCasterCameraCfg,
268    TiledCamera,
269    TiledCameraCfg,
270    patterns,
271)
272from isaaclab.test.benchmark import BaseIsaacLabBenchmark, DictMeasurement, SingleMeasurement
273from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth
274
275from isaaclab_tasks.utils import load_cfg_from_registry
276
277"""
278Camera Creation
279"""
280
281
282def create_camera_base(
283    camera_cfg: type[CameraCfg | TiledCameraCfg],
284    num_cams: int,
285    data_types: list[str],
286    height: int,
287    width: int,
288    prim_path: str | None = None,
289    instantiate: bool = True,
290) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
291    """Generalized function to create a camera or tiled camera sensor."""
292    # Determine prim prefix based on the camera class
293    name = camera_cfg.class_type.__name__
294
295    if instantiate:
296        # Create the necessary prims
297        for idx in range(num_cams):
298            sim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
299    if prim_path is None:
300        prim_path = f"/World/{name}_.*/{name}"
301    # If valid camera settings are provided, create the camera
302    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
303        cfg = camera_cfg(
304            prim_path=prim_path,
305            update_period=0,
306            height=height,
307            width=width,
308            data_types=data_types,
309            spawn=sim_utils.PinholeCameraCfg(
310                focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
311            ),
312        )
313        if instantiate:
314            return camera_cfg.class_type(cfg=cfg)
315        else:
316            return cfg
317    else:
318        return None
319
320
321def create_tiled_cameras(
322    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
323) -> TiledCamera | None:
324    """Defines the tiled camera sensor to add to the scene."""
325    if data_types is None:
326        data_types = ["rgb", "depth"]
327    return create_camera_base(
328        camera_cfg=TiledCameraCfg,
329        num_cams=num_cams,
330        data_types=data_types,
331        height=height,
332        width=width,
333    )
334
335
336def create_cameras(
337    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
338) -> Camera | None:
339    """Defines the Standard cameras."""
340    if data_types is None:
341        data_types = ["rgb", "depth"]
342    return create_camera_base(
343        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
344    )
345
346
347def create_ray_caster_cameras(
348    num_cams: int = 2,
349    data_types: list[str] = ["distance_to_image_plane"],
350    mesh_prim_paths: list[str] = ["/World/ground"],
351    height: int = 100,
352    width: int = 120,
353    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
354    instantiate: bool = True,
355) -> RayCasterCamera | RayCasterCameraCfg | None:
356    """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
357    for idx in range(num_cams):
358        sim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")
359
360    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
361        cam_cfg = RayCasterCameraCfg(
362            prim_path=prim_path,
363            mesh_prim_paths=mesh_prim_paths,
364            update_period=0,
365            offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
366            data_types=data_types,
367            debug_vis=False,
368            pattern_cfg=patterns.PinholeCameraPatternCfg(
369                focal_length=24.0,
370                horizontal_aperture=20.955,
371                height=480,
372                width=640,
373            ),
374        )
375        if instantiate:
376            return RayCasterCamera(cfg=cam_cfg)
377        else:
378            return cam_cfg
379
380    else:
381        return None
382
383
384def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
385    """Grab a simple tiled camera config for injecting into task environments."""
386    return create_camera_base(
387        TiledCameraCfg,
388        num_cams=args_cli.num_tiled_cameras,
389        data_types=args_cli.tiled_camera_data_types,
390        width=args_cli.width,
391        height=args_cli.height,
392        prim_path="{ENV_REGEX_NS}/" + prim_path,
393        instantiate=False,
394    )
395
396
397def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
398    """Grab a simple standard camera config for injecting into task environments."""
399    return create_camera_base(
400        CameraCfg,
401        num_cams=args_cli.num_standard_cameras,
402        data_types=args_cli.standard_camera_data_types,
403        width=args_cli.width,
404        height=args_cli.height,
405        prim_path="{ENV_REGEX_NS}/" + prim_path,
406        instantiate=False,
407    )
408
409
410def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
411    """Grab a simple ray caster config for injecting into task environments."""
412    return create_ray_caster_cameras(
413        num_cams=args_cli.num_ray_caster_cameras,
414        data_types=args_cli.ray_caster_camera_data_types,
415        width=args_cli.width,
416        height=args_cli.height,
417        prim_path="{ENV_REGEX_NS}/" + prim_path,
418    )
419
420
421"""
422Scene Creation
423"""
424
425
426def design_scene(
427    num_tiled_cams: int = 2,
428    num_standard_cams: int = 0,
429    num_ray_caster_cams: int = 0,
430    tiled_camera_data_types: list[str] | None = None,
431    standard_camera_data_types: list[str] | None = None,
432    ray_caster_camera_data_types: list[str] | None = None,
433    height: int = 100,
434    width: int = 200,
435    num_objects: int = 20,
436    mesh_prim_paths: list[str] = ["/World/ground"],
437) -> dict:
438    """Design the scene."""
439    if tiled_camera_data_types is None:
440        tiled_camera_data_types = ["rgb"]
441    if standard_camera_data_types is None:
442        standard_camera_data_types = ["rgb"]
443    if ray_caster_camera_data_types is None:
444        ray_caster_camera_data_types = ["distance_to_image_plane"]
445
446    # Populate scene
447    # -- Ground-plane
448    cfg = sim_utils.GroundPlaneCfg()
449    cfg.func("/World/ground", cfg)
450    # -- Lights
451    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
452    cfg.func("/World/Light", cfg)
453
454    # Create a dictionary for the scene entities
455    scene_entities = {}
456
457    # Xform to hold objects
458    sim_utils.create_prim("/World/Objects", "Xform")
459    # Random objects
460    for i in range(num_objects):
461        # sample random position
462        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
463        position *= np.asarray([1.5, 1.5, 0.5])
464        # sample random color
465        color = (random.random(), random.random(), random.random())
466        # choose random prim type
467        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
468        common_properties = {
469            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
470            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
471            "collision_props": sim_utils.CollisionPropertiesCfg(),
472            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
473            "semantic_tags": [("class", prim_type)],
474        }
475        if prim_type == "Cube":
476            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
477        elif prim_type == "Cone":
478            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
479        elif prim_type == "Cylinder":
480            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
481        # Rigid Object
482        obj_cfg = RigidObjectCfg(
483            prim_path=f"/World/Objects/Obj_{i:02d}",
484            spawn=shape_cfg,
485            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
486        )
487        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)
488
489    # Sensors
490    standard_camera = create_cameras(
491        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
492    )
493    tiled_camera = create_tiled_cameras(
494        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
495    )
496    ray_caster_camera = create_ray_caster_cameras(
497        num_cams=num_ray_caster_cams,
498        data_types=ray_caster_camera_data_types,
499        mesh_prim_paths=mesh_prim_paths,
500        height=height,
501        width=width,
502    )
503    # return the scene information
504    if tiled_camera is not None:
505        scene_entities["tiled_camera"] = tiled_camera
506    if standard_camera is not None:
507        scene_entities["standard_camera"] = standard_camera
508    if ray_caster_camera is not None:
509        scene_entities["ray_caster_camera"] = ray_caster_camera
510    return scene_entities
511
512
513def inject_cameras_into_task(
514    task: str,
515    num_cams: int,
516    camera_name_prefix: str,
517    camera_creation_callable: Callable,
518    num_cameras_per_env: int = 1,
519) -> gym.Env:
520    """Loads the task, sticks cameras into the config, and creates the environment."""
521    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
522    cfg.sim.device = args_cli.device
523    cfg.sim.use_fabric = args_cli.use_fabric
524    scene_cfg = cfg.scene
525
526    num_envs = int(num_cams / num_cameras_per_env)
527    scene_cfg.num_envs = num_envs
528
529    for idx in range(num_cameras_per_env):
530        suffix = "" if idx == 0 else str(idx)
531        name = camera_name_prefix + suffix
532        setattr(scene_cfg, name, camera_creation_callable(name))
533    cfg.scene = scene_cfg
534    env = gym.make(task, cfg=cfg)
535    return env
536
537
538"""
539System diagnosis
540"""
541
542
543def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
544    """Get the maximum CPU, RAM, GPU utilization (processing), and
545    GPU memory usage percentages since the last time reset was true."""
546    if reset:
547        max_values[:] = [0, 0, 0, 0]  # Reset the max values
548
549    # CPU utilization
550    cpu_usage = psutil.cpu_percent(interval=0.1)
551    max_values[0] = max(max_values[0], cpu_usage)
552
553    # RAM utilization
554    memory_info = psutil.virtual_memory()
555    ram_usage = memory_info.percent
556    max_values[1] = max(max_values[1], ram_usage)
557
558    # GPU utilization using pynvml
559    if torch.cuda.is_available():
560        if args_cli.autotune:
561            pynvml.nvmlInit()  # Initialize NVML
562            for i in range(torch.cuda.device_count()):
563                handle = pynvml.nvmlDeviceGetHandleByIndex(i)
564
565                # GPU Utilization
566                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
567                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
568                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)
569
570                # GPU Memory Usage
571                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
572                gpu_memory_total = memory_info.total
573                gpu_memory_used = memory_info.used
574                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
575                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)
576
577            pynvml.nvmlShutdown()  # Shutdown NVML after usage
578    else:
579        gpu_processing_utilization_percent = None
580        gpu_memory_utilization_percent = None
581    return max_values
582
583
584"""
585Experiment
586"""
587
588
589def run_simulator(
590    sim: sim_utils.SimulationContext | None,
591    scene_entities: dict | InteractiveScene,
592    warm_start_length: int = 10,
593    experiment_length: int = 100,
594    tiled_camera_data_types: list[str] | None = None,
595    standard_camera_data_types: list[str] | None = None,
596    ray_caster_camera_data_types: list[str] | None = None,
597    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
598    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
599    convert_depth_to_camera_to_image_plane: bool = True,
600    max_cameras_per_env: int = 1,
601    env: gym.Env | None = None,
602) -> dict:
603    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""
604
605    if tiled_camera_data_types is None:
606        tiled_camera_data_types = ["rgb"]
607    if standard_camera_data_types is None:
608        standard_camera_data_types = ["rgb"]
609    if ray_caster_camera_data_types is None:
610        ray_caster_camera_data_types = ["distance_to_image_plane"]
611
612    # Initialize camera lists
613    tiled_cameras = []
614    standard_cameras = []
615    ray_caster_cameras = []
616
617    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
618    for i in range(max_cameras_per_env):
619        # Extract tiled cameras
620        tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
621        standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
622        ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"
623
624        try:  # a membership check ("key in scene_entities") raises even when the key is present, so use try/except
625            tiled_cameras.append(scene_entities[tiled_camera_key])
626            standard_cameras.append(scene_entities[standard_camera_key])
627            ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
628        except KeyError:
629            break
630
631    # Initialize camera counts
632    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
633    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
634    labels = ["tiled", "standard", "ray_caster"]
635
636    if sim is not None:
637        # Set camera world poses
638        for camera_list in camera_lists:
639            for camera in camera_list:
640                num_cameras = camera.data.intrinsic_matrices.size(0)
641                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
642                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
643                camera.set_world_poses_from_view(positions, targets)
644
645    # Initialize timing variables
646    timestep = 0
647    total_time = 0.0
648    valid_timesteps = 0
649    sim_step_time = 0.0
650
651    while simulation_app.is_running() and timestep < experiment_length:
652        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
653        get_utilization_percentages()
654
655        # Measure the total simulation step time
656        step_start_time = time.time()
657
658        if sim is not None:
659            sim.step()
660
661        if env is not None:
662            with torch.inference_mode():
663                # compute zero actions
664                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
665                # apply actions
666                env.step(actions)
667
668        # Update cameras and process vision data within the simulation step
669        clouds = {}
670        images = {}
671        depth_images = {}
672
673        # Loop through all camera lists and their data_types
674        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
675            for cam_idx, camera in enumerate(camera_list):
676                if env is None:  # No env, need to step cams manually
677                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
678                    camera.update(dt=sim.get_physics_dt())
679
680                for data_type in data_types:
681                    data_label = f"{label}_{cam_idx}_{data_type}"
682
683                    if depth_predicate(data_type):  # is a depth image, want to create cloud
684                        depth = camera.data.output[data_type]
685                        depth_images[data_label + "_raw"] = depth
686                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
687                            depth = orthogonalize_perspective_depth(
688                                camera.data.output[data_type], camera.data.intrinsic_matrices
689                            )
690                            depth_images[data_label + "_undistorted"] = depth
691
692                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
693                        clouds[data_label] = pointcloud
694                    else:  # rgb image, just save it
695                        image = camera.data.output[data_type]
696                        images[data_label] = image
697
698        # End timing for the step
699        step_end_time = time.time()
700        sim_step_time += step_end_time - step_start_time
701
702        if timestep > warm_start_length:
703            get_utilization_percentages(reset=True)
704            total_time += step_end_time - step_start_time
705            valid_timesteps += 1
706
707        timestep += 1
708
709    # Calculate average timings
710    if valid_timesteps > 0:
711        avg_timestep_duration = total_time / valid_timesteps
712        avg_sim_step_duration = sim_step_time / experiment_length
713    else:
714        avg_timestep_duration = 0.0
715        avg_sim_step_duration = 0.0
716
717    # Package timing analytics in a dictionary
718    timing_analytics = {
719        "average_timestep_duration": avg_timestep_duration,
720        "average_sim_step_duration": avg_sim_step_duration,
721        "total_simulation_time": sim_step_time,
722        "total_experiment_duration": sim_step_time,
723    }
724
725    system_utilization_analytics = get_utilization_percentages()
726
727    print("--- Benchmark Results ---")
728    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
729    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
730    print(f"Total simulation time: {sim_step_time:.6f} seconds")
731    print("\nSystem Utilization Statistics:")
732    print(
733        f"| CPU:{system_utilization_analytics[0]}% | "
734        f"RAM:{system_utilization_analytics[1]}% | "
735        f"GPU Compute:{system_utilization_analytics[2]}% | "
736        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
737    )
738
739    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}
740
741
742def main():
743    """Main function."""
744    # Load simulation context
745    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
746        raise ValueError("You must select at least one camera.")
747    if (
748        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
749        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
750        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
751    ):
752        print("[WARNING]: You have elected to use more than one camera type.")
753        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
754        print(
755            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark,"
756            "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
757        )
758        raise ValueError("Benchmark one camera at a time.")
759
760    # Determine which camera type is being used
761    camera_type = "tiled"
762    num_cameras = args_cli.num_tiled_cameras
763    if args_cli.num_standard_cameras > 0:
764        camera_type = "standard"
765        num_cameras = args_cli.num_standard_cameras
766    elif args_cli.num_ray_caster_cameras > 0:
767        camera_type = "ray_caster"
768        num_cameras = args_cli.num_ray_caster_cameras
769
770    # Create the benchmark
771    backend_type = args_cli.benchmark_backend
772    benchmark = BaseIsaacLabBenchmark(
773        benchmark_name="benchmark_cameras",
774        backend_type=backend_type,
775        output_path=args_cli.output_path,
776        use_recorders=True,
777        frametime_recorders=backend_type in ("summary", "omniperf"),
778        output_prefix="benchmark_cameras",
779        workflow_metadata={
780            "metadata": [
781                {"name": "task", "data": args_cli.task},
782                {"name": "camera_type", "data": camera_type},
783                {"name": "num_cameras", "data": num_cameras},
784                {"name": "height", "data": args_cli.height},
785                {"name": "width", "data": args_cli.width},
786                {"name": "experiment_length", "data": args_cli.experiment_length},
787                {"name": "autotune", "data": args_cli.autotune},
788            ]
789        },
790    )
791
792    print("[INFO]: Designing the scene")
793    final_analysis = None
794
795    if args_cli.task is None:
796        print("[INFO]: No task environment provided, creating random scene.")
797        sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
798        sim = sim_utils.SimulationContext(sim_cfg)
799        # Set main camera
800        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
801        scene_entities = design_scene(
802            num_tiled_cams=args_cli.num_tiled_cameras,
803            num_standard_cams=args_cli.num_standard_cameras,
804            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
805            tiled_camera_data_types=args_cli.tiled_camera_data_types,
806            standard_camera_data_types=args_cli.standard_camera_data_types,
807            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
808            height=args_cli.height,
809            width=args_cli.width,
810            num_objects=args_cli.num_objects,
811            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
812        )
813        # Play simulator
814        sim.reset()
815        # Now we are ready!
816        print("[INFO]: Setup complete...")
817        # Run simulator
818        final_analysis = run_simulator(
819            sim=sim,
820            scene_entities=scene_entities,
821            warm_start_length=args_cli.warm_start_length,
822            experiment_length=args_cli.experiment_length,
823            tiled_camera_data_types=args_cli.tiled_camera_data_types,
824            standard_camera_data_types=args_cli.standard_camera_data_types,
825            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
826            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
827        )
828    else:
829        print("[INFO]: Using known task environment, injecting cameras.")
830        autotune_iter = 0
831        max_sys_util_thresh = [0.0, 0.0, 0.0]
832        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
833        cur_num_cams = max_num_cams
834        cur_sys_util = max_sys_util_thresh
835        interval = args_cli.autotune_camera_count_interval
836
837        if args_cli.autotune:
838            max_sys_util_thresh = args_cli.autotune_max_percentage_util
839            max_num_cams = args_cli.autotune_max_camera_count
840            print("[INFO]: Auto tuning until any of the following threshold are met")
841            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
842            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
843        # Determine which camera is being tested...
844        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
845        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
846        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
847        camera_name_prefix = ""
848        camera_creation_callable = None
849        num_cams = 0
850        if tiled_camera_cfg is not None:
851            camera_name_prefix = "tiled_camera"
852            camera_creation_callable = create_tiled_camera_cfg
853            num_cams = args_cli.num_tiled_cameras
854        elif standard_camera_cfg is not None:
855            camera_name_prefix = "standard_camera"
856            camera_creation_callable = create_standard_camera_cfg
857            num_cams = args_cli.num_standard_cameras
858        elif ray_caster_camera_cfg is not None:
859            camera_name_prefix = "ray_caster_camera"
860            camera_creation_callable = create_ray_caster_camera_cfg
861            num_cams = args_cli.num_ray_caster_cameras
862
863        while (
864            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
865            and cur_num_cams <= max_num_cams
866        ):
867            cur_num_cams = num_cams + interval * autotune_iter
868            autotune_iter += 1
869
870            env = inject_cameras_into_task(
871                task=args_cli.task,
872                num_cams=cur_num_cams,
873                camera_name_prefix=camera_name_prefix,
874                camera_creation_callable=camera_creation_callable,
875                num_cameras_per_env=args_cli.task_num_cameras_per_env,
876            )
877            env.reset()
878            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
879            analysis = run_simulator(
880                sim=None,
881                scene_entities=env.unwrapped.scene,
882                warm_start_length=args_cli.warm_start_length,
883                experiment_length=args_cli.experiment_length,
884                tiled_camera_data_types=args_cli.tiled_camera_data_types,
885                standard_camera_data_types=args_cli.standard_camera_data_types,
886                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
887                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
888                max_cameras_per_env=args_cli.task_num_cameras_per_env,
889                env=env,
890            )
891
892            cur_sys_util = analysis["system_utilization_analytics"]
893            final_analysis = analysis
894            print("Triggering reset...")
895            env.close()
896            sim_utils.create_new_stage()
897        print("[INFO]: DONE! Feel free to CTRL + C Me ")
898        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
899        print("Keep in mind, this is without any training running on the GPU.")
900        print("Set lower utilization thresholds to account for training.")
901
902        if not args_cli.autotune:
903            print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")
904
905    # Log benchmark measurements
906    if final_analysis is not None:
907        timing = final_analysis["timing_analytics"]
908        sys_util = final_analysis["system_utilization_analytics"]
909
910        # Log timing measurements
911        benchmark.add_measurement(
912            "runtime",
913            measurement=SingleMeasurement(
914                name="Average Timestep Duration", value=timing["average_timestep_duration"] * 1000, unit="ms"
915            ),
916        )
917        benchmark.add_measurement(
918            "runtime",
919            measurement=SingleMeasurement(
920                name="Average Simulation Step Duration", value=timing["average_sim_step_duration"] * 1000, unit="ms"
921            ),
922        )
923        benchmark.add_measurement(
924            "runtime",
925            measurement=SingleMeasurement(
926                name="Total Simulation Time", value=timing["total_simulation_time"] * 1000, unit="ms"
927            ),
928        )
929
930        # Log system utilization
931        benchmark.add_measurement(
932            "runtime",
933            measurement=DictMeasurement(
934                name="System Utilization",
935                value={
936                    "cpu_percent": sys_util[0],
937                    "ram_percent": sys_util[1],
938                    "gpu_compute_percent": sys_util[2],
939                    "gpu_memory_percent": sys_util[3],
940                },
941            ),
942        )
943
944    # Finalize benchmark
945    benchmark.update_manual_recorders()
946    benchmark._finalize_impl()
947
948
949if __name__ == "__main__":
950    # run the main function
951    main()
952    # close sim app
953    simulation_app.close()

Possible Parameters#

First, run

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h

to see all possible parameters you can vary with this utility.

See the command-line parameters related to autotune for more information about automatically determining the maximum camera count.

Compare Performance in Task Environments and Automatically Determine Task Max Camera Count#

Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.

For example, to see how your system handles 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total), in RGB mode only, run

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb

If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also find the maximum number of cameras that you can run in the specified environment up to certain performance thresholds (specified by maximum CPU utilization percent, maximum RAM utilization percent, maximum GPU compute percent, and maximum GPU memory percent). For example, to find the maximum number of cameras you can run with cartpole, you could run:

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50

Autotuning may crash the program if it tries to run too many cameras at once; the maximum percentage utilization parameters are meant to prevent this from happening.
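In essence, autotune repeatedly increases the camera count by a fixed interval and re-measures system utilization, stopping once any threshold or the camera cap is exceeded. A simplified Python sketch of that search follows; the `measure_utilization` callback is hypothetical, standing in for running the benchmark at a given camera count:

```python
def autotune_camera_count(max_util, max_cameras, interval, measure_utilization):
    """Increase the camera count by `interval` each iteration until any
    utilization threshold in `max_util` (e.g. CPU%, RAM%, GPU%) or the
    camera cap is exceeded. Returns the last camera count tested, i.e.
    the count at which a threshold was first crossed."""
    num_cameras = 0
    utilization = [0.0] * len(max_util)
    while (
        all(u <= m for u, m in zip(utilization, max_util))
        and num_cameras + interval <= max_cameras
    ):
        num_cameras += interval
        utilization = measure_utilization(num_cameras)
    return num_cameras

# Hypothetical measurement: GPU load grows linearly with camera count.
fake_measure = lambda n: [10.0, 20.0, 0.6 * n]
print(autotune_camera_count([100.0, 80.0, 50.0], 500, 25, fake_measure))  # → 100
```

Like the real script, this sketch reports the count at which a threshold was crossed, which is why it is worth setting the thresholds conservatively.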

The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
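As a quick check, converting the reported total camera count back to an environment count is just integer division (the function name here is illustrative, not part of the benchmark's API):

```python
def num_environments(total_cameras: int, cameras_per_env: int) -> int:
    # The benchmark reports a total camera count across all environments;
    # divide by the per-environment camera count to recover the environment count.
    return total_cameras // cameras_per_env

print(num_environments(100, 2))  # → 50
```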

Compare Camera Type and Performance (Without a Specified Task)#

This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras outputting instance segmentation and normals, one could run

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100

If your system cannot handle this for performance reasons, the process will be killed. It is recommended to monitor CPU/RAM and GPU utilization while running this script, to get an idea of how many resources rendering the desired cameras requires. On Ubuntu, you can use tools like htop and nvtop to monitor resources live while running this script; on Windows, you can use the Task Manager.
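If you prefer to log utilization from a script rather than watch htop, here is a minimal Linux-only sketch that estimates RAM utilization from /proc/meminfo (the function name is illustrative; for GPU statistics the benchmark's autotune relies on pynvml instead):

```python
def ram_percent_used() -> float:
    """Approximate system RAM utilization (%) on Linux, read from /proc/meminfo."""
    fields = {}
    with open("/proc/meminfo") as meminfo:
        for line in meminfo:
            key, value = line.split(":", 1)
            fields[key.strip()] = int(value.split()[0])  # values reported in kB
    used_kb = fields["MemTotal"] - fields["MemAvailable"]
    return 100.0 * used_kb / fields["MemTotal"]

print(f"RAM used: {ram_percent_used():.1f}%")
```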

If your system has a hard time handling the desired cameras, you can try the following:

  • Switch to headless mode (supply --headless)

  • Ensure you are using the GPU pipeline, not the CPU pipeline

  • If you aren't using Tiled Cameras, switch to Tiled Cameras

  • Decrease the camera resolution

  • Decrease the number of data types requested for each camera

  • Decrease the number of cameras

  • Decrease the number of objects in the scene

If your system is able to handle the number of cameras, then the timing statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.