Find How Many/What Cameras You Should Train With
Isaac Lab currently provides several camera types: USD cameras (standard), tiled cameras,
and ray caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative performance
across parameters such as camera quantity, image dimensions, and data types.
This utility is provided so that one can easily find the camera type and parameters that are the most performant while still meeting the requirements of the user's scenario. It also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
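The selection problem above can be framed numerically. The sketch below is not part of the script; the function name and the timing numbers are hypothetical, standing in for results you would collect from several benchmark runs. It shows one way to pick the largest camera count whose measured average step time stays within a per-step budget:

```python
# Hypothetical helper: given average step times measured at different camera
# counts (e.g. from several runs of benchmark_cameras.py), pick the largest
# camera count that stays within a per-step time budget.

def max_cameras_within_budget(measurements: dict[int, float], budget_s: float) -> int:
    """measurements maps camera count -> average step time in seconds."""
    viable = [count for count, step_time in measurements.items() if step_time <= budget_s]
    return max(viable, default=0)

# Illustrative results from three runs (camera count -> seconds per step).
results = {64: 0.021, 128: 0.038, 256: 0.074}
print(max_cameras_within_budget(results, budget_s=0.05))  # 128
```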
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. In addition,
if you install pynvml, the utility can automatically find the maximum
number of cameras that can run in your task environment, up to a
specified system resource utilization threshold (without training; zero actions
are taken at each timestep).
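The autotune behavior reduces to a simple stopping rule: starting from an initial camera count, keep adding a fixed interval of cameras while every measured utilization stays under its threshold and the count stays under a hard cap. The sketch below is a simplification of the script's loop, with a toy `measure` callable standing in for the psutil/pynvml queries the script actually performs:

```python
# Simplified sketch of the autotune stopping logic. measure(count) stands in
# for the script's psutil/pynvml queries and returns utilization percentages
# in the order [cpu, ram, gpu_compute, gpu_memory].

def autotune(start: int, interval: int, max_cams: int, thresholds, measure) -> int:
    """Return the last camera count that ran within all utilization thresholds."""
    count = start
    last_ok = 0
    while count <= max_cams:
        utilization = measure(count)
        if any(u > t for u, t in zip(utilization, thresholds)):
            break  # some resource limit was exceeded at this count
        last_ok = count
        count += interval
    return last_ok

# Toy model: GPU memory utilization grows linearly with camera count.
fake_measure = lambda n: [20.0, 30.0, 40.0, 0.5 * n]
print(autotune(start=25, interval=25, max_cams=4096,
               thresholds=[100.0, 80.0, 80.0, 80.0], measure=fake_measure))  # 150
```

With these defaults, camera counts of 25 through 150 stay under the 80% GPU memory threshold, and 175 exceeds it, so the sketch reports 150.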
This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks
directory.
Code for benchmark_cameras.py
1# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
2# All rights reserved.
3#
4# SPDX-License-Identifier: BSD-3-Clause
5
6"""
7This script might help you determine how many cameras your system can realistically run
8at different desired settings.
9
10You can supply different task environments to inject cameras into, or just test a sample scene.
11Additionally, you can automatically find the maximum amount of cameras you can run a task with
12through the auto-tune functionality.
13
14.. code-block:: bash
15
16 # Usage with GUI
17 ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
18
19 # Usage with headless
20 ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless
21
22"""
23
24"""Launch Isaac Sim Simulator first."""
25
26import argparse
27from collections.abc import Callable
28
29from isaaclab.app import AppLauncher
30
31# parse the arguments
32args_cli = argparse.Namespace()
33
34parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")
35
36"""
37The following arguments only need to be supplied for when one wishes
38to try injecting cameras into their environment, and automatically determining
39the maximum camera count.
40"""
41parser.add_argument(
42 "--task",
43 type=str,
44 default=None,
45 required=False,
46 help="Supply this argument to spawn cameras within a known manager-based task environment.",
47)
48
49parser.add_argument(
50 "--autotune",
51 default=False,
52 action="store_true",
53 help=(
54 "Autotuning is only supported for provided task environments."
55 " Supply this argument to increase the number of environments until a desired threshold is reached."
56 " Install pynvml in your environment: ./isaaclab.sh -m pip install pynvml"
57 ),
58)
59
60parser.add_argument(
61 "--task_num_cameras_per_env",
62 type=int,
63 default=1,
64 help="The number of cameras per environment to use when using a known task.",
65)
66
67parser.add_argument(
68 "--use_fabric", action="store_true", default=False, help="Enable fabric and use USD I/O operations."
69)
70
71parser.add_argument(
72 "--autotune_max_percentage_util",
73 nargs="+",
74 type=float,
75 default=[100.0, 80.0, 80.0, 80.0],
76 required=False,
77 help=(
78 "The system utilization percentage thresholds to reach before an autotune is finished. "
79 "If any one of these limits is hit, the autotune stops. "
80 "Thresholds are, in order: maximum CPU percentage utilization, "
81 "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
82 "and maximum GPU memory utilization."
83 ),
84)
85
86parser.add_argument(
87 "--autotune_max_camera_count", type=int, default=4096, help="The maximum amount of cameras allowed in an autotune."
88)
89
90parser.add_argument(
91 "--autotune_camera_count_interval",
92 type=int,
93 default=25,
94 help=(
95 "The number of cameras to try to add to the environment if the current camera count"
96 " falls within permitted system resource utilization limits."
97 ),
98)
99
100"""
101The following arguments are shared for when injecting cameras into a task environment,
102as well as when creating cameras independent of a task environment.
103"""
104
105parser.add_argument(
106 "--num_tiled_cameras",
107 type=int,
108 default=0,
109 required=False,
110 help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
111)
112
113parser.add_argument(
114 "--num_standard_cameras",
115 type=int,
116 default=0,
117 required=False,
118 help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
119)
120
121parser.add_argument(
122 "--num_ray_caster_cameras",
123 type=int,
124 default=0,
125 required=False,
126 help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
127)
128
129parser.add_argument(
130 "--tiled_camera_data_types",
131 nargs="+",
132 type=str,
133 default=["rgb", "depth"],
134 help="The data types rendered by the tiled camera",
135)
136
137parser.add_argument(
138 "--standard_camera_data_types",
139 nargs="+",
140 type=str,
141 default=["rgb", "distance_to_image_plane", "distance_to_camera"],
142 help="The data types rendered by the standard camera",
143)
144
145parser.add_argument(
146 "--ray_caster_camera_data_types",
147 nargs="+",
148 type=str,
149 default=["distance_to_image_plane"],
150 help="The data types rendered by the ray caster camera.",
151)
152
153parser.add_argument(
154 "--ray_caster_visible_mesh_prim_paths",
155 nargs="+",
156 type=str,
157 default=["/World/ground"],
158 help="WARNING: Ray Caster can currently only cast against a single, static object.",
159)
160
161parser.add_argument(
162 "--convert_depth_to_camera_to_image_plane",
163 action="store_true",
164 default=True,
165 help=(
166 "Enable undistorting from perspective view (distance to camera data_type) "
167 "to orthogonal view (distance to plane data_type) for depth. "
168 "This is currently needed to create undistorted depth images/point clouds."
169 ),
170)
171
172parser.add_argument(
173 "--keep_raw_depth",
174 dest="convert_depth_to_camera_to_image_plane",
175 action="store_false",
176 help=(
177 "Disable undistorting from perspective view (distance to camera) "
178 "to orthogonal view (distance to plane data_type) for depth."
179 ),
180)
181
182parser.add_argument(
183 "--height",
184 type=int,
185 default=120,
186 required=False,
187 help="Height in pixels of cameras",
188)
189
190parser.add_argument(
191 "--width",
192 type=int,
193 default=140,
194 required=False,
195 help="Width in pixels of cameras",
196)
197
198parser.add_argument(
199 "--warm_start_length",
200 type=int,
201 default=3,
202 required=False,
203 help=(
204 "Number of steps to run the sim before starting the benchmark. "
205 "Needed to avoid blank images at the start of the simulation."
206 ),
207)
208
209parser.add_argument(
210 "--experiment_length",
211 type=int,
212 default=15,
213 required=False,
214 help="Number of steps to average over",
215)
216
217# This argument is only used when a task is not provided.
218parser.add_argument(
219 "--num_objects",
220 type=int,
221 default=10,
222 required=False,
223 help="Number of objects to spawn into the scene when not using a known task.",
224)
225
226# Benchmark arguments
227parser.add_argument(
228 "--benchmark_backend",
229 type=str,
230 default="omniperf",
231 choices=["json", "osmo", "omniperf", "summary"],
232 help="Benchmarking backend options; defaults to omniperf.",
233)
234parser.add_argument("--output_path", type=str, default=".", help="Path to output benchmark results.")
235
236
237AppLauncher.add_app_launcher_args(parser)
238args_cli = parser.parse_args()
239args_cli.enable_cameras = True
240
241if args_cli.autotune:
242 import pynvml
243
244if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
245 print("[WARNING]: Ray Casting is only currently supported for a single, static object")
246# launch omniverse app
247app_launcher = AppLauncher(args_cli)
248simulation_app = app_launcher.app
249
250"""Rest everything follows."""
251
252import random
253import time
254
255import gymnasium as gym
256import numpy as np
257import psutil
258import torch
259
260import isaaclab.sim as sim_utils
261from isaaclab.assets import RigidObject, RigidObjectCfg
262from isaaclab.scene.interactive_scene import InteractiveScene
263from isaaclab.sensors import (
264 Camera,
265 CameraCfg,
266 RayCasterCamera,
267 RayCasterCameraCfg,
268 TiledCamera,
269 TiledCameraCfg,
270 patterns,
271)
272from isaaclab.test.benchmark import BaseIsaacLabBenchmark, DictMeasurement, SingleMeasurement
273from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth
274
275from isaaclab_tasks.utils import load_cfg_from_registry
276
277"""
278Camera Creation
279"""
280
281
282def create_camera_base(
283 camera_cfg: type[CameraCfg | TiledCameraCfg],
284 num_cams: int,
285 data_types: list[str],
286 height: int,
287 width: int,
288 prim_path: str | None = None,
289 instantiate: bool = True,
290) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
291 """Generalized function to create a camera or tiled camera sensor."""
292 # Determine prim prefix based on the camera class
293 name = camera_cfg.class_type.__name__
294
295 if instantiate:
296 # Create the necessary prims
297 for idx in range(num_cams):
298 sim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
299 if prim_path is None:
300 prim_path = f"/World/{name}_.*/{name}"
301 # If valid camera settings are provided, create the camera
302 if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
303 cfg = camera_cfg(
304 prim_path=prim_path,
305 update_period=0,
306 height=height,
307 width=width,
308 data_types=data_types,
309 spawn=sim_utils.PinholeCameraCfg(
310 focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
311 ),
312 )
313 if instantiate:
314 return camera_cfg.class_type(cfg=cfg)
315 else:
316 return cfg
317 else:
318 return None
319
320
321def create_tiled_cameras(
322 num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
323) -> TiledCamera | None:
324 if data_types is None:
325 data_types = ["rgb", "depth"]
326 """Defines the tiled camera sensor to add to the scene."""
327 return create_camera_base(
328 camera_cfg=TiledCameraCfg,
329 num_cams=num_cams,
330 data_types=data_types,
331 height=height,
332 width=width,
333 )
334
335
336def create_cameras(
337 num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
338) -> Camera | None:
339 """Defines the Standard cameras."""
340 if data_types is None:
341 data_types = ["rgb", "depth"]
342 return create_camera_base(
343 camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
344 )
345
346
347def create_ray_caster_cameras(
348 num_cams: int = 2,
349 data_types: list[str] = ["distance_to_image_plane"],
350 mesh_prim_paths: list[str] = ["/World/ground"],
351 height: int = 100,
352 width: int = 120,
353 prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
354 instantiate: bool = True,
355) -> RayCasterCamera | RayCasterCameraCfg | None:
356 """Create the ray caster cameras; these use a different configuration than the standard/tiled cameras."""
357 for idx in range(num_cams):
358 sim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")
359
360 if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
361 cam_cfg = RayCasterCameraCfg(
362 prim_path=prim_path,
363 mesh_prim_paths=mesh_prim_paths,
364 update_period=0,
365 offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
366 data_types=data_types,
367 debug_vis=False,
368 pattern_cfg=patterns.PinholeCameraPatternCfg(
369 focal_length=24.0,
370 horizontal_aperture=20.955,
371 height=480,
372 width=640,
373 ),
374 )
375 if instantiate:
376 return RayCasterCamera(cfg=cam_cfg)
377 else:
378 return cam_cfg
379
380 else:
381 return None
382
383
384def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
385 """Grab a simple tiled camera config for injecting into task environments."""
386 return create_camera_base(
387 TiledCameraCfg,
388 num_cams=args_cli.num_tiled_cameras,
389 data_types=args_cli.tiled_camera_data_types,
390 width=args_cli.width,
391 height=args_cli.height,
392 prim_path="{ENV_REGEX_NS}/" + prim_path,
393 instantiate=False,
394 )
395
396
397def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
398 """Grab a simple standard camera config for injecting into task environments."""
399 return create_camera_base(
400 CameraCfg,
401 num_cams=args_cli.num_standard_cameras,
402 data_types=args_cli.standard_camera_data_types,
403 width=args_cli.width,
404 height=args_cli.height,
405 prim_path="{ENV_REGEX_NS}/" + prim_path,
406 instantiate=False,
407 )
408
409
410def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
411 """Grab a simple ray caster config for injecting into task environments."""
412 return create_ray_caster_cameras(
413 num_cams=args_cli.num_ray_caster_cameras,
414 data_types=args_cli.ray_caster_camera_data_types,
415 width=args_cli.width,
416 height=args_cli.height,
417 prim_path="{ENV_REGEX_NS}/" + prim_path,
418 )
419
420
421"""
422Scene Creation
423"""
424
425
426def design_scene(
427 num_tiled_cams: int = 2,
428 num_standard_cams: int = 0,
429 num_ray_caster_cams: int = 0,
430 tiled_camera_data_types: list[str] | None = None,
431 standard_camera_data_types: list[str] | None = None,
432 ray_caster_camera_data_types: list[str] | None = None,
433 height: int = 100,
434 width: int = 200,
435 num_objects: int = 20,
436 mesh_prim_paths: list[str] = ["/World/ground"],
437) -> dict:
438 """Design the scene."""
439 if tiled_camera_data_types is None:
440 tiled_camera_data_types = ["rgb"]
441 if standard_camera_data_types is None:
442 standard_camera_data_types = ["rgb"]
443 if ray_caster_camera_data_types is None:
444 ray_caster_camera_data_types = ["distance_to_image_plane"]
445
446 # Populate scene
447 # -- Ground-plane
448 cfg = sim_utils.GroundPlaneCfg()
449 cfg.func("/World/ground", cfg)
450 # -- Lights
451 cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
452 cfg.func("/World/Light", cfg)
453
454 # Create a dictionary for the scene entities
455 scene_entities = {}
456
457 # Xform to hold objects
458 sim_utils.create_prim("/World/Objects", "Xform")
459 # Random objects
460 for i in range(num_objects):
461 # sample random position
462 position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
463 position *= np.asarray([1.5, 1.5, 0.5])
464 # sample random color
465 color = (random.random(), random.random(), random.random())
466 # choose random prim type
467 prim_type = random.choice(["Cube", "Cone", "Cylinder"])
468 common_properties = {
469 "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
470 "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
471 "collision_props": sim_utils.CollisionPropertiesCfg(),
472 "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
473 "semantic_tags": [("class", prim_type)],
474 }
475 if prim_type == "Cube":
476 shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
477 elif prim_type == "Cone":
478 shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
479 elif prim_type == "Cylinder":
480 shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
481 # Rigid Object
482 obj_cfg = RigidObjectCfg(
483 prim_path=f"/World/Objects/Obj_{i:02d}",
484 spawn=shape_cfg,
485 init_state=RigidObjectCfg.InitialStateCfg(pos=position),
486 )
487 scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)
488
489 # Sensors
490 standard_camera = create_cameras(
491 num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
492 )
493 tiled_camera = create_tiled_cameras(
494 num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
495 )
496 ray_caster_camera = create_ray_caster_cameras(
497 num_cams=num_ray_caster_cams,
498 data_types=ray_caster_camera_data_types,
499 mesh_prim_paths=mesh_prim_paths,
500 height=height,
501 width=width,
502 )
503 # return the scene information
504 if tiled_camera is not None:
505 scene_entities["tiled_camera"] = tiled_camera
506 if standard_camera is not None:
507 scene_entities["standard_camera"] = standard_camera
508 if ray_caster_camera is not None:
509 scene_entities["ray_caster_camera"] = ray_caster_camera
510 return scene_entities
511
512
513def inject_cameras_into_task(
514 task: str,
515 num_cams: int,
516 camera_name_prefix: str,
517 camera_creation_callable: Callable,
518 num_cameras_per_env: int = 1,
519) -> gym.Env:
520 """Loads the task, sticks cameras into the config, and creates the environment."""
521 cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
522 cfg.sim.device = args_cli.device
523 cfg.sim.use_fabric = args_cli.use_fabric
524 scene_cfg = cfg.scene
525
526 num_envs = int(num_cams / num_cameras_per_env)
527 scene_cfg.num_envs = num_envs
528
529 for idx in range(num_cameras_per_env):
530 suffix = "" if idx == 0 else str(idx)
531 name = camera_name_prefix + suffix
532 setattr(scene_cfg, name, camera_creation_callable(name))
533 cfg.scene = scene_cfg
534 env = gym.make(task, cfg=cfg)
535 return env
536
537
538"""
539System diagnosis
540"""
541
542
543def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
544 """Get the maximum CPU, RAM, GPU utilization (processing), and
545 GPU memory usage percentages since the last time reset was true."""
546 if reset:
547 max_values[:] = [0, 0, 0, 0] # Reset the max values
548
549 # CPU utilization
550 cpu_usage = psutil.cpu_percent(interval=0.1)
551 max_values[0] = max(max_values[0], cpu_usage)
552
553 # RAM utilization
554 memory_info = psutil.virtual_memory()
555 ram_usage = memory_info.percent
556 max_values[1] = max(max_values[1], ram_usage)
557
558 # GPU utilization using pynvml
559 if torch.cuda.is_available():
560 if args_cli.autotune:
561 pynvml.nvmlInit() # Initialize NVML
562 for i in range(torch.cuda.device_count()):
563 handle = pynvml.nvmlDeviceGetHandleByIndex(i)
564
565 # GPU Utilization
566 gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
567 gpu_processing_utilization_percent = gpu_utilization.gpu # GPU core utilization
568 max_values[2] = max(max_values[2], gpu_processing_utilization_percent)
569
570 # GPU Memory Usage
571 memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
572 gpu_memory_total = memory_info.total
573 gpu_memory_used = memory_info.used
574 gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
575 max_values[3] = max(max_values[3], gpu_memory_utilization_percent)
576
577 pynvml.nvmlShutdown() # Shutdown NVML after usage
578 else:
579 gpu_processing_utilization_percent = None
580 gpu_memory_utilization_percent = None
581 return max_values
582
583
584"""
585Experiment
586"""
587
588
589def run_simulator(
590 sim: sim_utils.SimulationContext | None,
591 scene_entities: dict | InteractiveScene,
592 warm_start_length: int = 10,
593 experiment_length: int = 100,
594 tiled_camera_data_types: list[str] | None = None,
595 standard_camera_data_types: list[str] | None = None,
596 ray_caster_camera_data_types: list[str] | None = None,
597 depth_predicate: Callable = lambda x: "to" in x or x == "depth",
598 perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
599 convert_depth_to_camera_to_image_plane: bool = True,
600 max_cameras_per_env: int = 1,
601 env: gym.Env | None = None,
602) -> dict:
603 """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""
604
605 if tiled_camera_data_types is None:
606 tiled_camera_data_types = ["rgb"]
607 if standard_camera_data_types is None:
608 standard_camera_data_types = ["rgb"]
609 if ray_caster_camera_data_types is None:
610 ray_caster_camera_data_types = ["distance_to_image_plane"]
611
612 # Initialize camera lists
613 tiled_cameras = []
614 standard_cameras = []
615 ray_caster_cameras = []
616
617 # Dynamically extract cameras from the scene entities up to max_cameras_per_env
618 for i in range(max_cameras_per_env):
619 # Extract tiled cameras
620 tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
621 standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
622 ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"
623
624 try: # a "key in scene_entities" membership check errors out even when the key is present, so use try/except
625 tiled_cameras.append(scene_entities[tiled_camera_key])
626 standard_cameras.append(scene_entities[standard_camera_key])
627 ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
628 except KeyError:
629 break
630
631 # Initialize camera counts
632 camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
633 camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
634 labels = ["tiled", "standard", "ray_caster"]
635
636 if sim is not None:
637 # Set camera world poses
638 for camera_list in camera_lists:
639 for camera in camera_list:
640 num_cameras = camera.data.intrinsic_matrices.size(0)
641 positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
642 targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
643 camera.set_world_poses_from_view(positions, targets)
644
645 # Initialize timing variables
646 timestep = 0
647 total_time = 0.0
648 valid_timesteps = 0
649 sim_step_time = 0.0
650
651 while simulation_app.is_running() and timestep < experiment_length:
652 print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
653 get_utilization_percentages()
654
655 # Measure the total simulation step time
656 step_start_time = time.time()
657
658 if sim is not None:
659 sim.step()
660
661 if env is not None:
662 with torch.inference_mode():
663 # compute zero actions
664 actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
665 # apply actions
666 env.step(actions)
667
668 # Update cameras and process vision data within the simulation step
669 clouds = {}
670 images = {}
671 depth_images = {}
672
673 # Loop through all camera lists and their data_types
674 for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
675 for cam_idx, camera in enumerate(camera_list):
676 if env is None: # No env, need to step cams manually
677 # Only update the camera if it hasn't been updated as part of scene_entities.update ...
678 camera.update(dt=sim.get_physics_dt())
679
680 for data_type in data_types:
681 data_label = f"{label}_{cam_idx}_{data_type}"
682
683 if depth_predicate(data_type): # is a depth image, want to create cloud
684 depth = camera.data.output[data_type]
685 depth_images[data_label + "_raw"] = depth
686 if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
687 depth = orthogonalize_perspective_depth(
688 camera.data.output[data_type], camera.data.intrinsic_matrices
689 )
690 depth_images[data_label + "_undistorted"] = depth
691
692 pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
693 clouds[data_label] = pointcloud
694 else: # rgb image, just save it
695 image = camera.data.output[data_type]
696 images[data_label] = image
697
698 # End timing for the step
699 step_end_time = time.time()
700 sim_step_time += step_end_time - step_start_time
701
702 if timestep > warm_start_length:
703 get_utilization_percentages(reset=True)
704 total_time += step_end_time - step_start_time
705 valid_timesteps += 1
706
707 timestep += 1
708
709 # Calculate average timings
710 if valid_timesteps > 0:
711 avg_timestep_duration = total_time / valid_timesteps
712 avg_sim_step_duration = sim_step_time / experiment_length
713 else:
714 avg_timestep_duration = 0.0
715 avg_sim_step_duration = 0.0
716
717 # Package timing analytics in a dictionary
718 timing_analytics = {
719 "average_timestep_duration": avg_timestep_duration,
720 "average_sim_step_duration": avg_sim_step_duration,
721 "total_simulation_time": sim_step_time,
722 "total_experiment_duration": sim_step_time,
723 }
724
725 system_utilization_analytics = get_utilization_percentages()
726
727 print("--- Benchmark Results ---")
728 print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
729 print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
730 print(f"Total simulation time: {sim_step_time:.6f} seconds")
731 print("\nSystem Utilization Statistics:")
732 print(
733 f"| CPU:{system_utilization_analytics[0]}% | "
734 f"RAM:{system_utilization_analytics[1]}% | "
735 f"GPU Compute:{system_utilization_analytics[2]}% | "
736 f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
737 )
738
739 return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}
740
741
742def main():
743 """Main function."""
744 # Load simulation context
745 if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
746 raise ValueError("You must select at least one camera.")
747 if (
748 (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
749 or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
750 or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
751 ):
752 print("[WARNING]: You have elected to use more than one camera type.")
753 print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
754 print(
755 "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark,"
756 "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
757 )
758 raise ValueError("Benchmark one camera at a time.")
759
760 # Determine which camera type is being used
761 camera_type = "tiled"
762 num_cameras = args_cli.num_tiled_cameras
763 if args_cli.num_standard_cameras > 0:
764 camera_type = "standard"
765 num_cameras = args_cli.num_standard_cameras
766 elif args_cli.num_ray_caster_cameras > 0:
767 camera_type = "ray_caster"
768 num_cameras = args_cli.num_ray_caster_cameras
769
770 # Create the benchmark
771 backend_type = args_cli.benchmark_backend
772 benchmark = BaseIsaacLabBenchmark(
773 benchmark_name="benchmark_cameras",
774 backend_type=backend_type,
775 output_path=args_cli.output_path,
776 use_recorders=True,
777 frametime_recorders=backend_type in ("summary", "omniperf"),
778 output_prefix="benchmark_cameras",
779 workflow_metadata={
780 "metadata": [
781 {"name": "task", "data": args_cli.task},
782 {"name": "camera_type", "data": camera_type},
783 {"name": "num_cameras", "data": num_cameras},
784 {"name": "height", "data": args_cli.height},
785 {"name": "width", "data": args_cli.width},
786 {"name": "experiment_length", "data": args_cli.experiment_length},
787 {"name": "autotune", "data": args_cli.autotune},
788 ]
789 },
790 )
791
792 print("[INFO]: Designing the scene")
793 final_analysis = None
794
795 if args_cli.task is None:
796 print("[INFO]: No task environment provided, creating random scene.")
797 sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
798 sim = sim_utils.SimulationContext(sim_cfg)
799 # Set main camera
800 sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
801 scene_entities = design_scene(
802 num_tiled_cams=args_cli.num_tiled_cameras,
803 num_standard_cams=args_cli.num_standard_cameras,
804 num_ray_caster_cams=args_cli.num_ray_caster_cameras,
805 tiled_camera_data_types=args_cli.tiled_camera_data_types,
806 standard_camera_data_types=args_cli.standard_camera_data_types,
807 ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
808 height=args_cli.height,
809 width=args_cli.width,
810 num_objects=args_cli.num_objects,
811 mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
812 )
813 # Play simulator
814 sim.reset()
815 # Now we are ready!
816 print("[INFO]: Setup complete...")
817 # Run simulator
818 final_analysis = run_simulator(
819 sim=sim,
820 scene_entities=scene_entities,
821 warm_start_length=args_cli.warm_start_length,
822 experiment_length=args_cli.experiment_length,
823 tiled_camera_data_types=args_cli.tiled_camera_data_types,
824 standard_camera_data_types=args_cli.standard_camera_data_types,
825 ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
826 convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
827 )
828 else:
829 print("[INFO]: Using known task environment, injecting cameras.")
830 autotune_iter = 0
831 max_sys_util_thresh = [0.0, 0.0, 0.0]
832 max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
833 cur_num_cams = max_num_cams
834 cur_sys_util = max_sys_util_thresh
835 interval = args_cli.autotune_camera_count_interval
836
837 if args_cli.autotune:
838 max_sys_util_thresh = args_cli.autotune_max_percentage_util
839 max_num_cams = args_cli.autotune_max_camera_count
840 print("[INFO]: Auto-tuning until any of the following thresholds is met")
841 print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
842 print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
843 # Determine which camera is being tested...
844 tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
845 standard_camera_cfg = create_standard_camera_cfg("standard_camera")
846 ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
847 camera_name_prefix = ""
848 camera_creation_callable = None
849 num_cams = 0
850 if tiled_camera_cfg is not None:
851 camera_name_prefix = "tiled_camera"
852 camera_creation_callable = create_tiled_camera_cfg
853 num_cams = args_cli.num_tiled_cameras
854 elif standard_camera_cfg is not None:
855 camera_name_prefix = "standard_camera"
856 camera_creation_callable = create_standard_camera_cfg
857 num_cams = args_cli.num_standard_cameras
858 elif ray_caster_camera_cfg is not None:
859 camera_name_prefix = "ray_caster_camera"
860 camera_creation_callable = create_ray_caster_camera_cfg
861 num_cams = args_cli.num_ray_caster_cameras
862
863 while (
864 all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
865 and cur_num_cams <= max_num_cams
866 ):
867 cur_num_cams = num_cams + interval * autotune_iter
868 autotune_iter += 1
869
870 env = inject_cameras_into_task(
871 task=args_cli.task,
872 num_cams=cur_num_cams,
873 camera_name_prefix=camera_name_prefix,
874 camera_creation_callable=camera_creation_callable,
875 num_cameras_per_env=args_cli.task_num_cameras_per_env,
876 )
877 env.reset()
878 print(f"Testing with {cur_num_cams} {camera_name_prefix}")
879 analysis = run_simulator(
880 sim=None,
881 scene_entities=env.unwrapped.scene,
882 warm_start_length=args_cli.warm_start_length,
883 experiment_length=args_cli.experiment_length,
884 tiled_camera_data_types=args_cli.tiled_camera_data_types,
885 standard_camera_data_types=args_cli.standard_camera_data_types,
886 ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
887 convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
888 max_cameras_per_env=args_cli.task_num_cameras_per_env,
889 env=env,
890 )
891
892 cur_sys_util = analysis["system_utilization_analytics"]
893 final_analysis = analysis
894 print("Triggering reset...")
895 env.close()
896 sim_utils.create_new_stage()
897 print("[INFO]: DONE! Feel free to CTRL + C Me ")
898 print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
899 print("Keep in mind, this is without any training running on the GPU.")
900 print("Set lower utilization thresholds to account for training.")
901
902 if not args_cli.autotune:
903 print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")
904
905 # Log benchmark measurements
906 if final_analysis is not None:
907 timing = final_analysis["timing_analytics"]
908 sys_util = final_analysis["system_utilization_analytics"]
909
910 # Log timing measurements
911 benchmark.add_measurement(
912 "runtime",
913 measurement=SingleMeasurement(
914 name="Average Timestep Duration", value=timing["average_timestep_duration"] * 1000, unit="ms"
915 ),
916 )
917 benchmark.add_measurement(
918 "runtime",
919 measurement=SingleMeasurement(
920 name="Average Simulation Step Duration", value=timing["average_sim_step_duration"] * 1000, unit="ms"
921 ),
922 )
923 benchmark.add_measurement(
924 "runtime",
925 measurement=SingleMeasurement(
926 name="Total Simulation Time", value=timing["total_simulation_time"] * 1000, unit="ms"
927 ),
928 )
929
930 # Log system utilization
931 benchmark.add_measurement(
932 "runtime",
933 measurement=DictMeasurement(
934 name="System Utilization",
935 value={
936 "cpu_percent": sys_util[0],
937 "ram_percent": sys_util[1],
938 "gpu_compute_percent": sys_util[2],
939 "gpu_memory_percent": sys_util[3],
940 },
941 ),
942 )
943
944 # Finalize benchmark
945 benchmark.update_manual_recorders()
946 benchmark._finalize_impl()
947
948
949if __name__ == "__main__":
950 # run the main function
951 main()
952 # close sim app
953 simulation_app.close()
Possible Parameters#
First, run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
to see all of the parameters you can vary with this utility.
See the command-line parameters related to autotune for more information about
automatically determining the maximum camera count.
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count#
Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.
For example, to see how your system handles 100 tiled cameras in the cartpole environment, with 2 cameras per environment (50 environments total) in RGB mode only, run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb
If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also
find the maximum number of cameras that you could run in the specified environment up to
a certain performance threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50
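The four numbers passed to --autotune_max_percentage_util are, in order, the max CPU utilization percent, max RAM utilization percent, max GPU compute percent, and max GPU memory percent. As a minimal sketch of the stopping condition the autotuner applies (mirroring the all(...) check in the script listing above):

```python
def within_limits(current_util, max_util):
    """Return True while every utilization reading stays at or below its threshold."""
    # Readings are ordered: CPU %, RAM %, GPU compute %, GPU memory %,
    # matching the order of --autotune_max_percentage_util.
    return all(cur <= limit for cur, limit in zip(current_util, max_util))

limits = [100.0, 80.0, 50.0, 50.0]  # thresholds from the example command above
print(within_limits([35.0, 40.0, 45.0, 30.0], limits))  # True: keep adding cameras
print(within_limits([35.0, 40.0, 55.0, 30.0], limits))  # False: GPU compute exceeds 50%
```

The autotuner keeps increasing the camera count by a fixed interval and re-running the benchmark until this check fails or the camera ceiling is reached.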
Autotune may cause the program to crash, which means it tried to run too many cameras at once. The max-percentage-utilization parameters are meant to prevent this from happening.
The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
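The camera-count-to-environment-count conversion is a simple integer division, sketched here for concreteness:

```python
def num_environments(total_cameras: int, cameras_per_env: int) -> int:
    # The benchmark reports a total camera count across all environments;
    # dividing by the cameras per environment gives the environment count.
    return total_cameras // cameras_per_env

print(num_environments(100, 2))  # 50 environments
```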
Compare Camera Type and Performance (Without a Specified Task)#
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, one could run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100
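As a rough way to reason about why resolution, camera count, and the number of data types all matter, you can estimate the per-step render payload. This sketch assumes float32 buffers and illustrative channel counts (4 for instance segmentation, 3 for normals); the actual buffer layouts in Isaac Lab may differ:

```python
def payload_bytes(width, height, num_cameras, channels_per_type, bytes_per_value=4):
    # Every requested data type adds its own channels to each camera's
    # per-step output, so payload scales with all of these factors.
    total_channels = sum(channels_per_type.values())
    return width * height * total_channels * num_cameras * bytes_per_value

# 2 standard cameras at 100x100 with the two data types from the command above.
size = payload_bytes(100, 100, 2, {"instance_segmentation_fast": 4, "normals": 3})
print(size)  # 560000 bytes per step under these assumptions
```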
If your system cannot handle this load, the process will be killed.
It's recommended to monitor CPU/RAM and GPU utilization while running this script, to get
an idea of how many resources rendering the desired cameras requires. On Ubuntu, you can use tools like htop and nvtop
to monitor resources live while the script runs; on Windows, you can use the Task Manager.
If your system has a hard time handling the desired cameras, you can try the following:

- Switch to headless mode (supply --headless)
- Ensure you are using the GPU pipeline, not the CPU pipeline
- If you aren't using tiled cameras, switch to tiled cameras
- Decrease the camera resolution
- Decrease the number of data_types requested for each camera
- Decrease the number of cameras
- Decrease the number of objects in the scene
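Several of these mitigations can be combined in one invocation. The sketch below uses illustrative flag values (all flags appear elsewhere in this guide); the command is assembled into a variable and echoed so you can inspect it before running:

```shell
# Illustrative lighter-weight settings: headless mode, one tiled camera,
# low resolution, a single data type, and fewer scene objects.
CMD="./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
 --headless \
 --height 64 --width 64 \
 --num_tiled_cameras 1 --tiled_camera_data_types rgb \
 --num_objects 10"
echo "$CMD"
```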
If your system is able to handle the requested number of cameras, the timing statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.