Find How Many/What Cameras You Should Train With
Isaac Lab currently provides several camera types: USD Cameras (standard), Tiled Cameras,
and Ray Caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative performance
at different parameters such as camera quantity, image dimensions, and data types.
This utility is provided so that one can easily find the camera type and parameters that are the most performant while meeting the requirements of the user’s scenario. It also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. Also,
if you install pynvml, you can let this utility automatically find the maximum
number of cameras that can run in your task environment up to a
specified system resource utilization threshold (without training; taking zero actions
at each timestep).
This guide accompanies the benchmark_cameras.py script in the source/standalone/benchmarks directory.
Code for benchmark_cameras.py
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script might help you determine how many cameras your system can realistically run
at different desired settings.

You can supply different task environments to inject cameras into, or just test a sample scene.
Additionally, you can automatically find the maximum number of cameras you can run a task with
through the auto-tune functionality.

.. code-block:: bash

    # Usage with GUI
    ./isaaclab.sh -p source/standalone/benchmarks/benchmark_cameras.py -h

    # Usage with headless
    ./isaaclab.sh -p source/standalone/benchmarks/benchmark_cameras.py -h --headless

"""

"""Launch Isaac Sim Simulator first."""

import argparse
from collections.abc import Callable

from omni.isaac.lab.app import AppLauncher

# parse the arguments
args_cli = argparse.Namespace()

parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")

"""
The following arguments only need to be supplied for when one wishes
to try injecting cameras into their environment, and automatically determining
the maximum camera count.
"""
parser.add_argument(
    "--task",
    type=str,
    default=None,
    required=False,
    help="Supply this argument to spawn cameras within a known manager-based task environment.",
)

parser.add_argument(
    "--autotune",
    default=False,
    action="store_true",
    help=(
        "Autotuning is only supported for provided task environments."
        " Supply this argument to increase the number of environments until a desired threshold is reached."
        " Install pynvml in your environment: ./isaaclab.sh -p -m pip install pynvml"
    ),
)

parser.add_argument(
    "--task_num_cameras_per_env",
    type=int,
    default=1,
    help="The number of cameras per environment to use when using a known task.",
)

parser.add_argument(
    "--use_fabric", action="store_true", default=False, help="Enable fabric instead of USD I/O operations."
)

parser.add_argument(
    "--autotune_max_percentage_util",
    nargs="+",
    type=float,
    default=[100.0, 80.0, 80.0, 80.0],
    required=False,
    help=(
        "The system utilization percentage thresholds to reach before an autotune is finished. "
        "If any one of these limits is hit, the autotune stops. "
        "Thresholds are, in order, maximum CPU percentage utilization, "
        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
        "and maximum GPU memory utilization."
    ),
)

parser.add_argument(
    "--autotune_max_camera_count", type=int, default=4096, help="The maximum number of cameras allowed in an autotune."
)

parser.add_argument(
    "--autotune_camera_count_interval",
    type=int,
    default=25,
    help=(
        "The number of cameras to try to add to the environment if the current camera count"
        " falls within permitted system resource utilization limits."
    ),
)

"""
The following arguments are shared for when injecting cameras into a task environment,
as well as when creating cameras independent of a task environment.
"""

parser.add_argument(
    "--num_tiled_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_standard_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_ray_caster_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--tiled_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "depth"],
    help="The data types rendered by the tiled camera",
)

parser.add_argument(
    "--standard_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
    help="The data types rendered by the standard camera",
)

parser.add_argument(
    "--ray_caster_camera_data_types",
    nargs="+",
    type=str,
    default=["distance_to_image_plane"],
    help="The data types rendered by the ray caster camera.",
)

parser.add_argument(
    "--ray_caster_visible_mesh_prim_paths",
    nargs="+",
    type=str,
    default=["/World/ground"],
    help="WARNING: Ray Caster can currently only cast against a single, static, object",
)

parser.add_argument(
    "--convert_depth_to_camera_to_image_plane",
    action="store_true",
    default=True,
    help=(
        "Enable undistorting from perspective view (distance to camera data_type) "
        "to orthogonal view (distance to plane data_type) for depth. "
        "This is currently needed to create undistorted depth images/point cloud."
    ),
)

parser.add_argument(
    "--keep_raw_depth",
    dest="convert_depth_to_camera_to_image_plane",
    action="store_false",
    help=(
        "Disable undistorting from perspective view (distance to camera) "
        "to orthogonal view (distance to plane data_type) for depth."
    ),
)

parser.add_argument(
    "--height",
    type=int,
    default=120,
    required=False,
    help="Height in pixels of cameras",
)

parser.add_argument(
    "--width",
    type=int,
    default=140,
    required=False,
    help="Width in pixels of cameras",
)

parser.add_argument(
    "--warm_start_length",
    type=int,
    default=3,
    required=False,
    help=(
        "Number of steps to run the sim before starting benchmark. "
        "Needed to avoid blank images at the start of the simulation."
    ),
)

parser.add_argument(
    "--experiment_length",
    type=int,
    default=15,
    required=False,
    help="Number of steps to average over",
)

# This argument is only used when a task is not provided.
parser.add_argument(
    "--num_objects",
    type=int,
    default=10,
    required=False,
    help="Number of objects to spawn into the scene when not using a known task.",
)


AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
args_cli.enable_cameras = True

if args_cli.autotune:
    import pynvml

if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

"""Rest everything follows."""

import gymnasium as gym
import numpy as np
import random
import time
import torch

import omni.isaac.core.utils.prims as prim_utils
import psutil
from omni.isaac.core.utils.stage import create_new_stage

import omni.isaac.lab.sim as sim_utils
from omni.isaac.lab.assets import RigidObject, RigidObjectCfg
from omni.isaac.lab.scene.interactive_scene import InteractiveScene
from omni.isaac.lab.sensors import (
    Camera,
    CameraCfg,
    RayCasterCamera,
    RayCasterCameraCfg,
    TiledCamera,
    TiledCameraCfg,
    patterns,
)
from omni.isaac.lab.utils.math import orthogonalize_perspective_depth, unproject_depth

from omni.isaac.lab_tasks.utils import load_cfg_from_registry

"""
Camera Creation
"""


def create_camera_base(
    camera_cfg: type[CameraCfg | TiledCameraCfg],
    num_cams: int,
    data_types: list[str],
    height: int,
    width: int,
    prim_path: str | None = None,
    instantiate: bool = True,
) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
    """Generalized function to create a camera or tiled camera sensor."""
    # Determine prim prefix based on the camera class
    name = camera_cfg.class_type.__name__

    if instantiate:
        # Create the necessary prims
        for idx in range(num_cams):
            prim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
    if prim_path is None:
        prim_path = f"/World/{name}_.*/{name}"
    # If valid camera settings are provided, create the camera
    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cfg = camera_cfg(
            prim_path=prim_path,
            update_period=0,
            height=height,
            width=width,
            data_types=data_types,
            spawn=sim_utils.PinholeCameraCfg(
                focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
            ),
        )
        if instantiate:
            return camera_cfg.class_type(cfg=cfg)
        else:
            return cfg
    else:
        return None


def create_tiled_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> TiledCamera | None:
    """Defines the tiled camera sensor to add to the scene."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=TiledCameraCfg,
        num_cams=num_cams,
        data_types=data_types,
        height=height,
        width=width,
    )


def create_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> Camera | None:
    """Defines the Standard cameras."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
    )


def create_ray_caster_cameras(
    num_cams: int = 2,
    data_types: list[str] = ["distance_to_image_plane"],
    mesh_prim_paths: list[str] = ["/World/ground"],
    height: int = 100,
    width: int = 120,
    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
    instantiate: bool = True,
) -> RayCasterCamera | RayCasterCameraCfg | None:
    """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
    if instantiate:
        # Only create prims when the sensor itself is created (mirrors create_camera_base)
        for idx in range(num_cams):
            prim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")

    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cam_cfg = RayCasterCameraCfg(
            prim_path=prim_path,
            mesh_prim_paths=mesh_prim_paths,
            update_period=0,
            offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
            data_types=data_types,
            debug_vis=False,
            pattern_cfg=patterns.PinholeCameraPatternCfg(
                focal_length=24.0,
                horizontal_aperture=20.955,
                # use the requested resolution so that it is comparable with the other camera types
                height=height,
                width=width,
            ),
        )
        if instantiate:
            return RayCasterCamera(cfg=cam_cfg)
        else:
            return cam_cfg

    else:
        return None


def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
    """Grab a simple tiled camera config for injecting into task environments."""
    return create_camera_base(
        TiledCameraCfg,
        num_cams=args_cli.num_tiled_cameras,
        data_types=args_cli.tiled_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
    """Grab a simple standard camera config for injecting into task environments."""
    return create_camera_base(
        CameraCfg,
        num_cams=args_cli.num_standard_cameras,
        data_types=args_cli.standard_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
    """Grab a simple ray caster config for injecting into task environments."""
    return create_ray_caster_cameras(
        num_cams=args_cli.num_ray_caster_cameras,
        data_types=args_cli.ray_caster_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        # return the config only; the task environment creates the sensor
        instantiate=False,
    )


"""
Scene Creation
"""


def design_scene(
    num_tiled_cams: int = 2,
    num_standard_cams: int = 0,
    num_ray_caster_cams: int = 0,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    height: int = 100,
    width: int = 200,
    num_objects: int = 20,
    mesh_prim_paths: list[str] = ["/World/ground"],
) -> dict:
    """Design the scene."""
    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Populate scene
    # -- Ground-plane
    cfg = sim_utils.GroundPlaneCfg()
    cfg.func("/World/ground", cfg)
    # -- Lights
    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
    cfg.func("/World/Light", cfg)

    # Create a dictionary for the scene entities
    scene_entities = {}

    # Xform to hold objects
    prim_utils.create_prim("/World/Objects", "Xform")
    # Random objects
    for i in range(num_objects):
        # sample random position
        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
        position *= np.asarray([1.5, 1.5, 0.5])
        # sample random color
        color = (random.random(), random.random(), random.random())
        # choose random prim type
        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
        common_properties = {
            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
            "collision_props": sim_utils.CollisionPropertiesCfg(),
            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
            "semantic_tags": [("class", prim_type)],
        }
        if prim_type == "Cube":
            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
        elif prim_type == "Cone":
            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
        elif prim_type == "Cylinder":
            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
        # Rigid Object
        obj_cfg = RigidObjectCfg(
            prim_path=f"/World/Objects/Obj_{i:02d}",
            spawn=shape_cfg,
            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
        )
        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)

    # Sensors
    standard_camera = create_cameras(
        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
    )
    tiled_camera = create_tiled_cameras(
        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
    )
    ray_caster_camera = create_ray_caster_cameras(
        num_cams=num_ray_caster_cams,
        data_types=ray_caster_camera_data_types,
        mesh_prim_paths=mesh_prim_paths,
        height=height,
        width=width,
    )
    # return the scene information
    if tiled_camera is not None:
        scene_entities["tiled_camera"] = tiled_camera
    if standard_camera is not None:
        scene_entities["standard_camera"] = standard_camera
    if ray_caster_camera is not None:
        scene_entities["ray_caster_camera"] = ray_caster_camera
    return scene_entities


def inject_cameras_into_task(
    task: str,
    num_cams: int,
    camera_name_prefix: str,
    camera_creation_callable: Callable,
    num_cameras_per_env: int = 1,
) -> gym.Env:
    """Loads the task, sticks cameras into the config, and creates the environment."""
    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
    cfg.sim.device = args_cli.device
    cfg.sim.use_fabric = args_cli.use_fabric
    scene_cfg = cfg.scene

    num_envs = int(num_cams / num_cameras_per_env)
    scene_cfg.num_envs = num_envs

    for idx in range(num_cameras_per_env):
        suffix = "" if idx == 0 else str(idx)
        name = camera_name_prefix + suffix
        setattr(scene_cfg, name, camera_creation_callable(name))
    cfg.scene = scene_cfg
    env = gym.make(task, cfg=cfg)
    return env


"""
System diagnosis
"""


def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
    """Get the maximum CPU, RAM, GPU utilization (processing), and
    GPU memory usage percentages since the last time reset was true."""
    if reset:
        max_values[:] = [0, 0, 0, 0]  # Reset the max values

    # CPU utilization
    cpu_usage = psutil.cpu_percent(interval=0.1)
    max_values[0] = max(max_values[0], cpu_usage)

    # RAM utilization
    memory_info = psutil.virtual_memory()
    ram_usage = memory_info.percent
    max_values[1] = max(max_values[1], ram_usage)

    # GPU utilization using pynvml
    if torch.cuda.is_available():

        if args_cli.autotune:
            pynvml.nvmlInit()  # Initialize NVML
            for i in range(torch.cuda.device_count()):
                handle = pynvml.nvmlDeviceGetHandleByIndex(i)

                # GPU Utilization
                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)

                # GPU Memory Usage
                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
                gpu_memory_total = memory_info.total
                gpu_memory_used = memory_info.used
                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)

            pynvml.nvmlShutdown()  # Shutdown NVML after usage
    else:
        gpu_processing_utilization_percent = None
        gpu_memory_utilization_percent = None
    return max_values


"""
Experiment
"""


def run_simulator(
    sim: sim_utils.SimulationContext | None,
    scene_entities: dict | InteractiveScene,
    warm_start_length: int = 10,
    experiment_length: int = 100,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
    convert_depth_to_camera_to_image_plane: bool = True,
    max_cameras_per_env: int = 1,
    env: gym.Env | None = None,
) -> dict:
    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""

    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Initialize camera lists
    tiled_cameras = []
    standard_cameras = []
    ray_caster_cameras = []

    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
    for i in range(max_cameras_per_env):
        suffix = str(i) if i > 0 else ""
        # Look up each camera type separately so that a missing type does not hide the others.
        # A membership check (``key in scene_entities``) is avoided: per the original note,
        # it errors out even if the key is present.
        for key, camera_list in (
            (f"tiled_camera{suffix}", tiled_cameras),
            (f"standard_camera{suffix}", standard_cameras),
            (f"ray_caster_camera{suffix}", ray_caster_cameras),
        ):
            try:
                camera_list.append(scene_entities[key])
            except KeyError:
                continue

    # Initialize camera counts
    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
    labels = ["tiled", "standard", "ray_caster"]

    if sim is not None:
        # Set camera world poses
        for camera_list in camera_lists:
            for camera in camera_list:
                num_cameras = camera.data.intrinsic_matrices.size(0)
                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
                camera.set_world_poses_from_view(positions, targets)

    # Initialize timing variables
    timestep = 0
    total_time = 0.0
    valid_timesteps = 0
    sim_step_time = 0.0

    while simulation_app.is_running() and timestep < experiment_length:
        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
        get_utilization_percentages()

        # Measure the total simulation step time
        step_start_time = time.time()

        if sim is not None:
            sim.step()

        if env is not None:
            with torch.inference_mode():
                # compute zero actions
                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
                # apply actions
                env.step(actions)

        # Update cameras and process vision data within the simulation step
        clouds = {}
        images = {}
        depth_images = {}

        # Loop through all camera lists and their data_types
        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
            for cam_idx, camera in enumerate(camera_list):

                if env is None:  # No env, need to step cams manually
                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
                    camera.update(dt=sim.get_physics_dt())

                for data_type in data_types:
                    data_label = f"{label}_{cam_idx}_{data_type}"

                    if depth_predicate(data_type):  # is a depth image, want to create cloud
                        depth = camera.data.output[data_type]
                        depth_images[data_label + "_raw"] = depth
                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
                            depth = orthogonalize_perspective_depth(
                                camera.data.output[data_type], camera.data.intrinsic_matrices
                            )
                            depth_images[data_label + "_undistorted"] = depth

                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
                        clouds[data_label] = pointcloud
                    else:  # rgb image, just save it
                        image = camera.data.output[data_type]
                        images[data_label] = image

        # End timing for the step
        step_end_time = time.time()
        sim_step_time += step_end_time - step_start_time

        if timestep > warm_start_length:
            # Reset the utilization maxima once, on the first measured step, so that they
            # cover the whole measured window rather than only the most recent sample.
            get_utilization_percentages(reset=valid_timesteps == 0)
            total_time += step_end_time - step_start_time
            valid_timesteps += 1

        timestep += 1

    # Calculate average timings
    if valid_timesteps > 0:
        avg_timestep_duration = total_time / valid_timesteps
        avg_sim_step_duration = sim_step_time / experiment_length
    else:
        avg_timestep_duration = 0.0
        avg_sim_step_duration = 0.0

    # Package timing analytics in a dictionary
    timing_analytics = {
        "average_timestep_duration": avg_timestep_duration,
        "average_sim_step_duration": avg_sim_step_duration,
        "total_simulation_time": sim_step_time,
        "total_experiment_duration": sim_step_time,
    }

    system_utilization_analytics = get_utilization_percentages()

    print("--- Benchmark Results ---")
    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
    print(f"Total simulation time: {sim_step_time:.6f} seconds")
    print("\nSystem Utilization Statistics:")
    print(
        f"| CPU:{system_utilization_analytics[0]}% | "
        f"RAM:{system_utilization_analytics[1]}% | "
        f"GPU Compute:{system_utilization_analytics[2]}% | "
        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
    )

    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}


def main():
    """Main function."""
    # Load simulation context
    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
        raise ValueError("You must select at least one camera.")
    if (
        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
    ):
        print("[WARNING]: You have elected to use more than one camera type.")
        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
        print(
            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark,"
            " num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
        )
        raise ValueError("Benchmark one camera at a time.")

    print("[INFO]: Designing the scene")
    if args_cli.task is None:
        print("[INFO]: No task environment provided, creating random scene.")
        sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
        sim = sim_utils.SimulationContext(sim_cfg)
        # Set main camera
        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
        scene_entities = design_scene(
            num_tiled_cams=args_cli.num_tiled_cameras,
            num_standard_cams=args_cli.num_standard_cameras,
            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            height=args_cli.height,
            width=args_cli.width,
            num_objects=args_cli.num_objects,
            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
        )
        # Play simulator
        sim.reset()
        # Now we are ready!
        print("[INFO]: Setup complete...")
        # Run simulator
        run_simulator(
            sim=sim,
            scene_entities=scene_entities,
            warm_start_length=args_cli.warm_start_length,
            experiment_length=args_cli.experiment_length,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
        )
    else:
        print("[INFO]: Using known task environment, injecting cameras.")
        autotune_iter = 0
        max_sys_util_thresh = [0.0, 0.0, 0.0]
        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
        cur_num_cams = max_num_cams
        cur_sys_util = max_sys_util_thresh
        interval = args_cli.autotune_camera_count_interval

        if args_cli.autotune:
            max_sys_util_thresh = args_cli.autotune_max_percentage_util
            max_num_cams = args_cli.autotune_max_camera_count
            print("[INFO]: Auto tuning until any of the following thresholds is met")
            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
        # Determine which camera is being tested...
        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
        camera_name_prefix = ""
        camera_creation_callable = None
        num_cams = 0
        if tiled_camera_cfg is not None:
            camera_name_prefix = "tiled_camera"
            camera_creation_callable = create_tiled_camera_cfg
            num_cams = args_cli.num_tiled_cameras
        elif standard_camera_cfg is not None:
            camera_name_prefix = "standard_camera"
            camera_creation_callable = create_standard_camera_cfg
            num_cams = args_cli.num_standard_cameras
        elif ray_caster_camera_cfg is not None:
            camera_name_prefix = "ray_caster_camera"
            camera_creation_callable = create_ray_caster_camera_cfg
            num_cams = args_cli.num_ray_caster_cameras

        while (
            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
            and cur_num_cams <= max_num_cams
        ):
            cur_num_cams = num_cams + interval * autotune_iter
            autotune_iter += 1

            env = inject_cameras_into_task(
                task=args_cli.task,
                num_cams=cur_num_cams,
                camera_name_prefix=camera_name_prefix,
                camera_creation_callable=camera_creation_callable,
                num_cameras_per_env=args_cli.task_num_cameras_per_env,
            )
            env.reset()
            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
            analysis = run_simulator(
                sim=None,
                scene_entities=env.unwrapped.scene,
                warm_start_length=args_cli.warm_start_length,
                experiment_length=args_cli.experiment_length,
                tiled_camera_data_types=args_cli.tiled_camera_data_types,
                standard_camera_data_types=args_cli.standard_camera_data_types,
                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
                max_cameras_per_env=args_cli.task_num_cameras_per_env,
                env=env,
            )

            cur_sys_util = analysis["system_utilization_analytics"]
            print("Triggering reset...")
            env.close()
            create_new_stage()
        print("[INFO]: DONE! Feel free to CTRL + C Me ")
        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
        print("Keep in mind, this is without any training running on the GPU.")
        print("Set lower utilization thresholds to account for training.")

    if not args_cli.autotune:
        print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")


if __name__ == "__main__":
    # run the main function
    main()
    # close sim app
    simulation_app.close()
Possible Parameters
First, run
./isaaclab.sh -p source/standalone/benchmarks/benchmark_cameras.py -h
to see all possible parameters you can vary with this utility.
See the command line parameters related to autotune for more information about
automatically determining the maximum camera count.
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count
Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.
For example, to see how your system could handle 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total) and only in RGB mode, run:
./isaaclab.sh -p source/standalone/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb
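The relationship between the requested camera count and the number of environments can be sketched in plain Python. This is a standalone illustration that mirrors the arithmetic and attribute naming in the script's inject_cameras_into_task function; the helper name plan_camera_injection is hypothetical and not part of Isaac Lab.

```python
def plan_camera_injection(num_cams: int, num_cameras_per_env: int, prefix: str) -> tuple[int, list[str]]:
    """Return the environment count and the per-environment camera attribute names."""
    if num_cameras_per_env <= 0:
        raise ValueError("num_cameras_per_env must be positive")
    # Same truncating division as the script: leftover cameras are dropped.
    num_envs = int(num_cams / num_cameras_per_env)
    # The first camera keeps the bare prefix; later ones get a numeric suffix.
    names = [prefix if idx == 0 else f"{prefix}{idx}" for idx in range(num_cameras_per_env)]
    return num_envs, names


num_envs, names = plan_camera_injection(100, 2, "tiled_camera")
print(num_envs, names)
```

Note that an odd camera count with 2 cameras per environment silently rounds down, so it is simplest to request a multiple of --task_num_cameras_per_env.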
If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also
find the maximum number of cameras that you could run in the specified environment up to
a certain performance threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:
./isaaclab.sh -p source/standalone/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50
Autotune may lead to the program crashing, which means that it tried to run too many cameras at once. However, the max percentage utilization parameter is meant to prevent this from happening.
The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
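The autotune search described above can be approximated with a small, simulator-free sketch. The function autotune_camera_count and the fake_measure utilization model are hypothetical stand-ins; in the real script each measurement comes from running the task with injected cameras. Unlike the script, which reports the last camera count it tested, this sketch returns the largest count that stayed within every limit.

```python
from collections.abc import Callable


def autotune_camera_count(
    measure: Callable[[int], list[float]],
    start: int,
    interval: int,
    max_util: list[float],
    max_cameras: int,
) -> int:
    """Increase the camera count by `interval` until a limit or the camera cap is exceeded."""
    best = 0
    count = start
    while count <= max_cameras:
        util = measure(count)  # [CPU %, RAM %, GPU compute %, GPU memory %]
        if any(cur > limit for cur, limit in zip(util, max_util)):
            break
        best = count
        count += interval
    return best


def fake_measure(num_cams: int) -> list[float]:
    """Made-up linear utilization model, used only to exercise the loop."""
    return [0.1 * num_cams, 0.2 * num_cams, 0.3 * num_cams, 0.25 * num_cams]


best = autotune_camera_count(fake_measure, start=100, interval=25, max_util=[100, 80, 50, 50], max_cameras=4096)
cameras_per_env = 2
print(best, "cameras ->", best // cameras_per_env, "environments")
```

Lowering the values in max_util is the knob that leaves headroom for training, as noted above.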
Compare Camera Type and Performance (Without a Specified Task)
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, one could run:
./isaaclab.sh -p source/standalone/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100
If your system cannot handle this due to performance reasons, then the process will be killed.
It’s recommended to monitor CPU/RAM utilization and GPU utilization while running this script, to get
an idea of how many resources rendering the desired camera requires. In Ubuntu, you can use tools like htop and nvtop
to live monitor resources while running this script, and in Windows, you can use the Task Manager.
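If you would rather log peak utilization from Python than watch htop, the running-maximum idea used by the script's get_utilization_percentages function can be sketched as below. PeakTracker and psutil_sampler are hypothetical names; psutil_sampler assumes the third-party psutil package is installed, while the demonstration at the bottom uses canned readings so it runs without it.

```python
from collections.abc import Callable


class PeakTracker:
    """Keeps the per-metric maximum of every sample seen since the last reset."""

    def __init__(self, sampler: Callable[[], dict[str, float]]):
        self._sampler = sampler
        self.peaks: dict[str, float] = {}

    def reset(self) -> None:
        self.peaks.clear()

    def sample(self) -> dict[str, float]:
        for name, value in self._sampler().items():
            self.peaks[name] = max(self.peaks.get(name, 0.0), value)
        return dict(self.peaks)


def psutil_sampler() -> dict[str, float]:
    """CPU and RAM utilization in percent (requires psutil)."""
    import psutil  # imported lazily so the tracker itself has no dependencies

    return {"cpu": psutil.cpu_percent(interval=0.1), "ram": psutil.virtual_memory().percent}


# Demonstration with canned readings instead of live measurements.
readings = iter([{"cpu": 20.0, "ram": 41.0}, {"cpu": 65.0, "ram": 40.0}, {"cpu": 30.0, "ram": 44.0}])
tracker = PeakTracker(lambda: next(readings))
for _ in range(3):
    peaks = tracker.sample()
print(peaks)
```

For live use, construct the tracker with psutil_sampler and call sample() once per simulation step; GPU metrics could be added to the sampler in the same way the script does with pynvml.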
If your system has a hard time handling the desired cameras, you can try the following:
Switch to headless mode (supply --headless)
Ensure you are using the GPU pipeline, not the CPU pipeline
If you aren’t using Tiled Cameras, switch to Tiled Cameras
Decrease the camera resolution
Decrease the number of data_types for each camera
Decrease the number of cameras
Decrease the number of objects in the scene
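To get a rough feel for how resolution, data types, and camera count interact, the raw size of the camera outputs per step can be estimated with simple arithmetic. The sketch below is illustrative only: the bytes-per-pixel table is an assumption (8-bit 3-channel color, 32-bit single-channel depth) rather than something taken from Isaac Lab, and it ignores rendering cost, which usually dominates.

```python
# Assumed bytes per pixel for each data type (illustrative values, not from Isaac Lab).
BYTES_PER_PIXEL = {
    "rgb": 3,
    "depth": 4,
    "distance_to_image_plane": 4,
    "distance_to_camera": 4,
}


def output_bytes_per_step(num_cameras: int, height: int, width: int, data_types: list[str]) -> int:
    """Estimate the total size of all camera output buffers produced in one step."""
    per_pixel = sum(BYTES_PER_PIXEL[name] for name in data_types)
    return num_cameras * height * width * per_pixel


full = output_bytes_per_step(100, 120, 140, ["rgb", "depth"])
half_res = output_bytes_per_step(100, 60, 70, ["rgb", "depth"])
rgb_only = output_bytes_per_step(100, 120, 140, ["rgb"])
print(full, half_res, rgb_only)
```

Under these assumptions, halving both image dimensions cuts the output size to a quarter, which is why resolution is often the cheapest setting to reduce first.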
If your system is able to handle the number of cameras, then the time statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.