Find How Many/What Cameras You Should Train With
Isaac Lab currently provides several camera types: USD Cameras (standard), Tiled Cameras,
and Ray Caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative performance
at different parameters such as camera quantity, image dimensions, and data types.
This utility is provided so that one can easily find the camera type and parameters that perform best while meeting the requirements of the user's scenario. It also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
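One way to compare results across configurations is to normalize the reported average step time by the number of cameras and the image size. The following is a minimal sketch of that arithmetic; the function name `camera_throughput` and the example numbers are assumptions for illustration, not part of the script or its output.

```python
def camera_throughput(num_cameras: int, height: int, width: int, avg_step_seconds: float) -> dict:
    """Convert an average step time into per-camera throughput figures."""
    if num_cameras <= 0 or avg_step_seconds <= 0:
        raise ValueError("num_cameras and avg_step_seconds must be positive")
    # Every camera produces one frame per step
    frames_per_second = num_cameras / avg_step_seconds
    return {
        "frames_per_second": frames_per_second,
        "pixels_per_second": frames_per_second * height * width,
        "seconds_per_camera": avg_step_seconds / num_cameras,
    }


# Hypothetical measurement: 100 cameras at 120x140 pixels with a 0.25 s average step
stats = camera_throughput(num_cameras=100, height=120, width=140, avg_step_seconds=0.25)
print(stats["frames_per_second"])  # 400.0
```

Comparing `pixels_per_second` rather than raw step time makes runs with different resolutions or camera counts easier to rank against each other.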
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. Also,
if you install pynvml, you can let this utility automatically find the maximum
number of cameras that can run in your task environment up to a
specified system resource utilization threshold (without training; taking zero actions
at each timestep).
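The autotune idea can be sketched as a simple search: keep adding cameras in fixed increments until any utilization figure exceeds its threshold. This is a simplified illustration, not the script's exact loop; `autotune_camera_count` and `fake_measure` are hypothetical names, and the fake measurement stands in for a real benchmark run. The default interval, camera cap, and thresholds mirror the script's argument defaults.

```python
from collections.abc import Callable


def autotune_camera_count(
    measure: Callable[[int], list[float]],
    thresholds: list[float],
    start: int = 1,
    interval: int = 25,
    max_cameras: int = 4096,
) -> int:
    """Return the largest tested camera count whose utilization stayed within every threshold."""
    best = 0
    count = start
    while count <= max_cameras:
        utilization = measure(count)
        # Stop as soon as any single resource exceeds its limit
        if any(used > limit for used, limit in zip(utilization, thresholds)):
            break
        best = count
        count += interval
    return best


# Hypothetical stand-in for a benchmark run: GPU memory grows by 0.5% per camera
def fake_measure(num_cameras: int) -> list[float]:
    # Order: CPU %, RAM %, GPU compute %, GPU memory %
    return [20.0, 30.0, 40.0, 0.5 * num_cameras]


best = autotune_camera_count(fake_measure, thresholds=[100.0, 80.0, 80.0, 80.0], start=10, interval=25)
print(best)  # 160
```

Note that this sketch returns the last count that stayed within the limits, which is a deliberate simplification for clarity.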
This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks
directory.
Code for benchmark_cameras.py
# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script might help you determine how many cameras your system can realistically run
at different desired settings.

You can supply different task environments to inject cameras into, or just test a sample scene.
Additionally, you can automatically find the maximum number of cameras you can run a task with
through the auto-tune functionality.

.. code-block:: bash

    # Usage with GUI
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h

    # Usage with headless
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless

"""

"""Launch Isaac Sim Simulator first."""

import argparse
from collections.abc import Callable
from dataclasses import MISSING

from isaaclab.app import AppLauncher

# parse the arguments
args_cli = argparse.Namespace()

parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")

"""
The following arguments only need to be supplied when one wishes
to try injecting cameras into their environment, and automatically determining
the maximum camera count.
"""
parser.add_argument(
    "--task",
    type=str,
    default=None,
    required=False,
    help="Supply this argument to spawn cameras within a known manager-based task environment.",
)

parser.add_argument(
    "--autotune",
    default=False,
    action="store_true",
    help=(
        "Autotuning is only supported for provided task environments."
        " Supply this argument to increase the number of environments until a desired threshold is reached."
        " Install pynvml in your environment: ./isaaclab.sh -m pip install pynvml"
    ),
)

parser.add_argument(
    "--task_num_cameras_per_env",
    type=int,
    default=1,
    help="The number of cameras per environment to use when using a known task.",
)

parser.add_argument(
    "--use_fabric", action="store_true", default=False, help="Enable Fabric instead of using USD I/O operations."
)

parser.add_argument(
    "--autotune_max_percentage_util",
    nargs="+",
    type=float,
    default=[100.0, 80.0, 80.0, 80.0],
    required=False,
    help=(
        "The system utilization percentage thresholds to reach before an autotune is finished. "
        "If any one of these limits is hit, the autotune stops. "
        "Thresholds are, in order, maximum CPU percentage utilization, "
        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
        "and maximum GPU memory utilization."
    ),
)

parser.add_argument(
    "--autotune_max_camera_count", type=int, default=4096, help="The maximum number of cameras allowed in an autotune."
)

parser.add_argument(
    "--autotune_camera_count_interval",
    type=int,
    default=25,
    help=(
        "The number of cameras to try to add to the environment if the current camera count"
        " falls within permitted system resource utilization limits."
    ),
)

"""
The following arguments are shared for when injecting cameras into a task environment,
as well as when creating cameras independent of a task environment.
"""

parser.add_argument(
    "--num_tiled_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_standard_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_ray_caster_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--tiled_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "depth"],
    help="The data types rendered by the tiled camera.",
)

parser.add_argument(
    "--standard_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
    help="The data types rendered by the standard camera.",
)

parser.add_argument(
    "--ray_caster_camera_data_types",
    nargs="+",
    type=str,
    default=["distance_to_image_plane"],
    help="The data types rendered by the ray caster camera.",
)

parser.add_argument(
    "--ray_caster_visible_mesh_prim_paths",
    nargs="+",
    type=str,
    default=["/World/ground"],
    help="WARNING: Ray Caster can currently only cast against a single, static object.",
)

parser.add_argument(
    "--convert_depth_to_camera_to_image_plane",
    action="store_true",
    default=True,
    help=(
        "Enable undistorting from perspective view (distance to camera data_type) "
        "to orthogonal view (distance to plane data_type) for depth. "
        "This is currently needed to create undistorted depth images/point cloud."
    ),
)

parser.add_argument(
    "--keep_raw_depth",
    dest="convert_depth_to_camera_to_image_plane",
    action="store_false",
    help=(
        "Disable undistorting from perspective view (distance to camera) "
        "to orthogonal view (distance to plane data_type) for depth."
    ),
)

parser.add_argument(
    "--height",
    type=int,
    default=120,
    required=False,
    help="Height in pixels of cameras",
)

parser.add_argument(
    "--width",
    type=int,
    default=140,
    required=False,
    help="Width in pixels of cameras",
)

parser.add_argument(
    "--warm_start_length",
    type=int,
    default=3,
    required=False,
    help=(
        "Number of steps to run the sim before starting benchmark. "
        "Needed to avoid blank images at the start of the simulation."
    ),
)

parser.add_argument(
    "--experiment_length",
    type=int,
    default=15,
    required=False,
    help="Number of steps to average over",
)

# This argument is only used when a task is not provided.
parser.add_argument(
    "--num_objects",
    type=int,
    default=10,
    required=False,
    help="Number of objects to spawn into the scene when not using a known task.",
)

# Benchmark arguments
parser.add_argument(
    "--benchmark_backend",
    type=str,
    default="omniperf",
    choices=["json", "osmo", "omniperf", "summary"],
    help="Benchmarking backend options, defaults to omniperf",
)
parser.add_argument("--output_path", type=str, default=".", help="Path to output benchmark results.")


AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
args_cli.enable_cameras = True

if args_cli.autotune:
    import pynvml

if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

"""Rest everything follows."""

import random
import time

import gymnasium as gym
import numpy as np
import psutil
import torch

import isaaclab.sim as sim_utils
from isaaclab.assets import RigidObject, RigidObjectCfg
from isaaclab.scene.interactive_scene import InteractiveScene
from isaaclab.sensors import (
    Camera,
    CameraCfg,
    RayCasterCamera,
    RayCasterCameraCfg,
    patterns,
)
from isaaclab.test.benchmark import BaseIsaacLabBenchmark, DictMeasurement, SingleMeasurement
from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth

from isaaclab_tasks.utils import load_cfg_from_registry

"""
Camera Creation
"""


def _get_camera_class_name(camera_cfg: type[CameraCfg]) -> str:
    """Return the configured camera sensor class name."""
    class_type_field = camera_cfg.__dataclass_fields__["class_type"]
    if class_type_field.default is not MISSING:
        class_type = class_type_field.default
    elif class_type_field.default_factory is not MISSING:
        class_type = class_type_field.default_factory()
    else:
        raise AttributeError(f"{camera_cfg.__name__} has no default class_type.")

    if hasattr(class_type, "__name__"):
        return class_type.__name__
    return str(class_type).rsplit(":", maxsplit=1)[-1]


def create_camera_base(
    camera_cfg: type[CameraCfg],
    num_cams: int,
    data_types: list[str],
    height: int,
    width: int,
    prim_path: str | None = None,
    instantiate: bool = True,
) -> Camera | CameraCfg | None:
    """Generalized function to create a camera or tiled camera sensor."""
    # Only create the camera if valid camera settings are provided
    if num_cams <= 0 or len(data_types) <= 0 or height <= 0 or width <= 0:
        return None

    name = _get_camera_class_name(camera_cfg)
    cfg = camera_cfg(
        prim_path=prim_path if prim_path is not None else f"/World/{name}_.*/{name}",
        update_period=0,
        height=height,
        width=width,
        data_types=data_types,
        spawn=sim_utils.PinholeCameraCfg(
            focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
        ),
    )
    if instantiate:
        # Create the necessary prims
        for idx in range(num_cams):
            sim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
        return cfg.class_type(cfg=cfg)

    return cfg


def create_tiled_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> Camera | None:
    """Defines the camera sensor to add to the scene."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=CameraCfg,
        num_cams=num_cams,
        data_types=data_types,
        height=height,
        width=width,
    )


def create_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> Camera | None:
    """Defines the Standard cameras."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
    )


def create_ray_caster_cameras(
    num_cams: int = 2,
    data_types: list[str] | None = None,
    mesh_prim_paths: list[str] | None = None,
    height: int = 100,
    width: int = 120,
    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
    instantiate: bool = True,
) -> RayCasterCamera | RayCasterCameraCfg | None:
    """Create the ray caster cameras; they use a different configuration than Standard/Tiled cameras."""
    # Avoid mutable default arguments
    if data_types is None:
        data_types = ["distance_to_image_plane"]
    if mesh_prim_paths is None:
        mesh_prim_paths = ["/World/ground"]

    # Only create the camera if valid camera settings are provided
    if num_cams <= 0 or len(data_types) <= 0 or height <= 0 or width <= 0:
        return None

    cam_cfg = RayCasterCameraCfg(
        prim_path=prim_path,
        mesh_prim_paths=mesh_prim_paths,
        update_period=0,
        offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
        data_types=data_types,
        debug_vis=False,
        pattern_cfg=patterns.PinholeCameraPatternCfg(
            focal_length=24.0,
            horizontal_aperture=20.955,
            # Use the requested resolution rather than a hard-coded 480x640
            height=height,
            width=width,
        ),
    )
    if not instantiate:
        return cam_cfg

    # Create the necessary prims only when the sensor itself is instantiated
    for idx in range(num_cams):
        sim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")
    return RayCasterCamera(cfg=cam_cfg)


def create_tiled_camera_cfg(prim_path: str) -> CameraCfg:
    """Grab a simple camera config for injecting into task environments."""
    return create_camera_base(
        CameraCfg,
        num_cams=args_cli.num_tiled_cameras,
        data_types=args_cli.tiled_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
    """Grab a simple standard camera config for injecting into task environments."""
    return create_camera_base(
        CameraCfg,
        num_cams=args_cli.num_standard_cameras,
        data_types=args_cli.standard_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
    """Grab a simple ray caster config for injecting into task environments."""
    return create_ray_caster_cameras(
        num_cams=args_cli.num_ray_caster_cameras,
        data_types=args_cli.ray_caster_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        # Return only the config; the task environment creates the sensor itself
        instantiate=False,
    )


"""
Scene Creation
"""


def design_scene(
    num_tiled_cams: int = 2,
    num_standard_cams: int = 0,
    num_ray_caster_cams: int = 0,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    height: int = 100,
    width: int = 200,
    num_objects: int = 20,
    mesh_prim_paths: list[str] | None = None,
) -> dict:
    """Design the scene."""
    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]
    if mesh_prim_paths is None:
        mesh_prim_paths = ["/World/ground"]

    # Populate scene
    # -- Ground-plane
    cfg = sim_utils.GroundPlaneCfg()
    cfg.func("/World/ground", cfg)
    # -- Lights
    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
    cfg.func("/World/Light", cfg)

    # Create a dictionary for the scene entities
    scene_entities = {}

    # Xform to hold objects
    sim_utils.create_prim("/World/Objects", "Xform")
    # Random objects
    for i in range(num_objects):
        # sample random position
        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
        position *= np.asarray([1.5, 1.5, 0.5])
        # sample random color
        color = (random.random(), random.random(), random.random())
        # choose random prim type
        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
        common_properties = {
            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
            "collision_props": sim_utils.CollisionPropertiesCfg(),
            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
            "semantic_tags": [("class", prim_type)],
        }
        if prim_type == "Cube":
            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
        elif prim_type == "Cone":
            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
        elif prim_type == "Cylinder":
            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
        # Rigid Object
        obj_cfg = RigidObjectCfg(
            prim_path=f"/World/Objects/Obj_{i:02d}",
            spawn=shape_cfg,
            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
        )
        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)

    # Sensors
    standard_camera = create_cameras(
        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
    )
    tiled_camera = create_tiled_cameras(
        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
    )
    ray_caster_camera = create_ray_caster_cameras(
        num_cams=num_ray_caster_cams,
        data_types=ray_caster_camera_data_types,
        mesh_prim_paths=mesh_prim_paths,
        height=height,
        width=width,
    )
    # return the scene information
    if tiled_camera is not None:
        scene_entities["tiled_camera"] = tiled_camera
    if standard_camera is not None:
        scene_entities["standard_camera"] = standard_camera
    if ray_caster_camera is not None:
        scene_entities["ray_caster_camera"] = ray_caster_camera
    return scene_entities


def inject_cameras_into_task(
    task: str,
    num_cams: int,
    camera_name_prefix: str,
    camera_creation_callable: Callable,
    num_cameras_per_env: int = 1,
) -> gym.Env:
    """Loads the task, sticks cameras into the config, and creates the environment."""
    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
    cfg.sim.device = args_cli.device
    cfg.sim.use_fabric = args_cli.use_fabric
    scene_cfg = cfg.scene

    # Each environment holds num_cameras_per_env cameras
    num_envs = int(num_cams / num_cameras_per_env)
    scene_cfg.num_envs = num_envs

    for idx in range(num_cameras_per_env):
        suffix = "" if idx == 0 else str(idx)
        name = camera_name_prefix + suffix
        setattr(scene_cfg, name, camera_creation_callable(name))
    cfg.scene = scene_cfg
    env = gym.make(task, cfg=cfg)
    return env


"""
System diagnosis
"""


def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
    """Get the maximum CPU, RAM, GPU utilization (processing), and
    GPU memory usage percentages since the last time reset was true."""
    # NOTE: The mutable default `max_values` is intentional; it keeps the running maxima between calls.
    if reset:
        max_values[:] = [0, 0, 0, 0]  # Reset the max values

    # CPU utilization
    cpu_usage = psutil.cpu_percent(interval=0.1)
    max_values[0] = max(max_values[0], cpu_usage)

    # RAM utilization
    memory_info = psutil.virtual_memory()
    ram_usage = memory_info.percent
    max_values[1] = max(max_values[1], ram_usage)

    # GPU utilization using pynvml (only imported when autotuning)
    if torch.cuda.is_available():
        if args_cli.autotune:
            pynvml.nvmlInit()  # Initialize NVML
            for i in range(torch.cuda.device_count()):
                handle = pynvml.nvmlDeviceGetHandleByIndex(i)

                # GPU Utilization
                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)

                # GPU Memory Usage
                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
                gpu_memory_total = memory_info.total
                gpu_memory_used = memory_info.used
                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)

            pynvml.nvmlShutdown()  # Shutdown NVML after usage
    else:
        gpu_processing_utilization_percent = None
        gpu_memory_utilization_percent = None
    return max_values


"""
Experiment
"""


def run_simulator(
    sim: sim_utils.SimulationContext | None,
    scene_entities: dict | InteractiveScene,
    warm_start_length: int = 10,
    experiment_length: int = 100,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
    convert_depth_to_camera_to_image_plane: bool = True,
    max_cameras_per_env: int = 1,
    env: gym.Env | None = None,
) -> dict:
    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""

    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Initialize camera lists
    tiled_cameras = []
    standard_cameras = []
    ray_caster_cameras = []

    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
    for i in range(max_cameras_per_env):
        suffix = str(i) if i > 0 else ""
        # Look up each camera type independently so that a missing type does not hide the others
        for camera_list, prefix in (
            (tiled_cameras, "tiled_camera"),
            (standard_cameras, "standard_camera"),
            (ray_caster_cameras, "ray_caster_camera"),
        ):
            # Use try/except: checking `key in scene_entities` errors out even if the key is present
            try:
                camera_list.append(scene_entities[prefix + suffix])
            except KeyError:
                continue

    # Group the cameras with their data types and labels
    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
    labels = ["tiled", "standard", "ray_caster"]

    if sim is not None:
        # Set camera world poses
        for camera_list in camera_lists:
            for camera in camera_list:
                num_cameras = camera.data.intrinsic_matrices.size(0)
                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
                camera.set_world_poses_from_view(positions, targets)

    # Initialize timing variables
    timestep = 0
    total_time = 0.0
    valid_timesteps = 0
    sim_step_time = 0.0

    while simulation_app.is_running() and timestep < experiment_length:
        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
        get_utilization_percentages()

        # Measure the total simulation step time
        step_start_time = time.time()

        if sim is not None:
            sim.step()

        if env is not None:
            with torch.inference_mode():
                # compute zero actions
                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
                # apply actions
                env.step(actions)

        # Update cameras and process vision data within the simulation step
        clouds = {}
        images = {}
        depth_images = {}

        # Loop through all camera lists and their data_types
        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
            for cam_idx, camera in enumerate(camera_list):
                if env is None:  # No env, need to step cams manually
                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
                    camera.update(dt=sim.get_physics_dt())

                for data_type in data_types:
                    data_label = f"{label}_{cam_idx}_{data_type}"

                    if depth_predicate(data_type):  # is a depth image, want to create cloud
                        depth = camera.data.output[data_type]
                        depth_images[data_label + "_raw"] = depth
                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
                            depth = orthogonalize_perspective_depth(
                                camera.data.output[data_type], camera.data.intrinsic_matrices
                            )
                            depth_images[data_label + "_undistorted"] = depth

                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
                        clouds[data_label] = pointcloud
                    else:  # rgb image, just save it
                        image = camera.data.output[data_type]
                        images[data_label] = image

        # End timing for the step
        step_end_time = time.time()
        sim_step_time += step_end_time - step_start_time

        if timestep > warm_start_length:
            # Restart utilization tracking once, on the first measured step after the warm start,
            # so that the reported maxima cover the whole measured window
            get_utilization_percentages(reset=valid_timesteps == 0)
            total_time += step_end_time - step_start_time
            valid_timesteps += 1

        timestep += 1

    # Calculate average timings
    if valid_timesteps > 0:
        avg_timestep_duration = total_time / valid_timesteps
        avg_sim_step_duration = sim_step_time / experiment_length
    else:
        avg_timestep_duration = 0.0
        avg_sim_step_duration = 0.0

    # Package timing analytics in a dictionary
    timing_analytics = {
        "average_timestep_duration": avg_timestep_duration,
        "average_sim_step_duration": avg_sim_step_duration,
        "total_simulation_time": sim_step_time,
        "total_experiment_duration": sim_step_time,
    }

    system_utilization_analytics = get_utilization_percentages()

    print("--- Benchmark Results ---")
    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
    print(f"Total simulation time: {sim_step_time:.6f} seconds")
    print("\nSystem Utilization Statistics:")
    print(
        f"| CPU:{system_utilization_analytics[0]}% | "
        f"RAM:{system_utilization_analytics[1]}% | "
        f"GPU Compute:{system_utilization_analytics[2]}% | "
        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
    )

    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}


def main():
    """Main function."""
    # Validate the camera selection
    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
        raise ValueError("You must select at least one camera.")
    if (
        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
    ):
        print("[WARNING]: You have elected to use more than one camera type.")
        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
        print(
            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark, "
            "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
        )
        raise ValueError("Benchmark one camera at a time.")

    # Determine which camera type is being used
    camera_type = "tiled"
    num_cameras = args_cli.num_tiled_cameras
    if args_cli.num_standard_cameras > 0:
        camera_type = "standard"
        num_cameras = args_cli.num_standard_cameras
    elif args_cli.num_ray_caster_cameras > 0:
        camera_type = "ray_caster"
        num_cameras = args_cli.num_ray_caster_cameras

    # Create the benchmark
    backend_type = args_cli.benchmark_backend
    benchmark = BaseIsaacLabBenchmark(
        benchmark_name="benchmark_cameras",
        backend_type=backend_type,
        output_path=args_cli.output_path,
        use_recorders=True,
        frametime_recorders=backend_type in ("summary", "omniperf"),
        output_prefix="benchmark_cameras",
        workflow_metadata={
            "metadata": [
                {"name": "task", "data": args_cli.task},
                {"name": "camera_type", "data": camera_type},
                {"name": "num_cameras", "data": num_cameras},
                {"name": "height", "data": args_cli.height},
                {"name": "width", "data": args_cli.width},
                {"name": "experiment_length", "data": args_cli.experiment_length},
                {"name": "autotune", "data": args_cli.autotune},
            ]
        },
    )

    print("[INFO]: Designing the scene")
    final_analysis = None

    if args_cli.task is None:
        print("[INFO]: No task environment provided, creating random scene.")
        # Load simulation context
        sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
        sim = sim_utils.SimulationContext(sim_cfg)
        # Set main camera
        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
        scene_entities = design_scene(
            num_tiled_cams=args_cli.num_tiled_cameras,
            num_standard_cams=args_cli.num_standard_cameras,
            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            height=args_cli.height,
            width=args_cli.width,
            num_objects=args_cli.num_objects,
            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
        )
        # Play simulator
        sim.reset()
        # Now we are ready!
        print("[INFO]: Setup complete...")
        # Run simulator
        final_analysis = run_simulator(
            sim=sim,
            scene_entities=scene_entities,
            warm_start_length=args_cli.warm_start_length,
            experiment_length=args_cli.experiment_length,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
        )
    else:
        print("[INFO]: Using known task environment, injecting cameras.")
        autotune_iter = 0
        max_sys_util_thresh = [0.0, 0.0, 0.0]
        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
        cur_num_cams = max_num_cams
        cur_sys_util = max_sys_util_thresh
        interval = args_cli.autotune_camera_count_interval

        if args_cli.autotune:
            max_sys_util_thresh = args_cli.autotune_max_percentage_util
            max_num_cams = args_cli.autotune_max_camera_count
            print("[INFO]: Auto tuning until any of the following thresholds is met")
            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
        # Determine which camera is being tested...
        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
        camera_name_prefix = ""
        camera_creation_callable = None
        num_cams = 0
        if tiled_camera_cfg is not None:
            camera_name_prefix = "tiled_camera"
            camera_creation_callable = create_tiled_camera_cfg
            num_cams = args_cli.num_tiled_cameras
        elif standard_camera_cfg is not None:
            camera_name_prefix = "standard_camera"
            camera_creation_callable = create_standard_camera_cfg
            num_cams = args_cli.num_standard_cameras
        elif ray_caster_camera_cfg is not None:
            camera_name_prefix = "ray_caster_camera"
            camera_creation_callable = create_ray_caster_camera_cfg
            num_cams = args_cli.num_ray_caster_cameras

        # Without autotune, the zero thresholds make this loop run exactly once
        while (
            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
            and cur_num_cams <= max_num_cams
        ):
            cur_num_cams = num_cams + interval * autotune_iter
            autotune_iter += 1

            env = inject_cameras_into_task(
                task=args_cli.task,
                num_cams=cur_num_cams,
                camera_name_prefix=camera_name_prefix,
                camera_creation_callable=camera_creation_callable,
                num_cameras_per_env=args_cli.task_num_cameras_per_env,
            )
            env.reset()
            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
            analysis = run_simulator(
                sim=None,
                scene_entities=env.unwrapped.scene,
                warm_start_length=args_cli.warm_start_length,
                experiment_length=args_cli.experiment_length,
                tiled_camera_data_types=args_cli.tiled_camera_data_types,
                standard_camera_data_types=args_cli.standard_camera_data_types,
                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
                max_cameras_per_env=args_cli.task_num_cameras_per_env,
                env=env,
            )

            cur_sys_util = analysis["system_utilization_analytics"]
            final_analysis = analysis
            print("Triggering reset...")
            env.close()
            sim_utils.create_new_stage()
        print("[INFO]: DONE! Feel free to CTRL + C Me ")
        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
        print("Keep in mind, this is without any training running on the GPU.")
        print("Set lower utilization thresholds to account for training.")

        if not args_cli.autotune:
            print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")

    # Log benchmark measurements
    if final_analysis is not None:
        timing = final_analysis["timing_analytics"]
        sys_util = final_analysis["system_utilization_analytics"]

        # Log timing measurements (converted from seconds to milliseconds)
        benchmark.add_measurement(
            "runtime",
            measurement=SingleMeasurement(
                name="Average Timestep Duration", value=timing["average_timestep_duration"] * 1000, unit="ms"
            ),
        )
        benchmark.add_measurement(
            "runtime",
            measurement=SingleMeasurement(
                name="Average Simulation Step Duration", value=timing["average_sim_step_duration"] * 1000, unit="ms"
            ),
        )
        benchmark.add_measurement(
            "runtime",
            measurement=SingleMeasurement(
                name="Total Simulation Time", value=timing["total_simulation_time"] * 1000, unit="ms"
            ),
        )

        # Log system utilization
        benchmark.add_measurement(
            "runtime",
            measurement=DictMeasurement(
                name="System Utilization",
                value={
                    "cpu_percent": sys_util[0],
                    "ram_percent": sys_util[1],
                    "gpu_compute_percent": sys_util[2],
                    "gpu_memory_percent": sys_util[3],
949 },
950 ),
951 )
952
953 # Finalize benchmark
954 benchmark.update_manual_recorders()
955 benchmark._finalize_impl()
956
957
958if __name__ == "__main__":
959 # run the main function
960 main()
961 # close sim app
962 simulation_app.close()
Possible Parameters#
First, run
python scripts/benchmarks/benchmark_cameras.py -h
to see all possible parameters you can vary with this utility.
See the command line parameters related to autotune for more information about
automatically determining maximum camera count.
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count#
Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.
For example, to see how your system handles 100 tiled cameras in the cartpole environment, with 2 cameras per environment (50 environments in total) and RGB output only, run:
python scripts/benchmarks/benchmark_cameras.py --task Isaac-Cartpole-v0 --num_tiled_cameras 100 --task_num_cameras_per_env 2 --tiled_camera_data_types rgb
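To compare several camera counts, the command above can be generated programmatically. The following is a minimal sketch that only assembles argument lists from the flags shown in this guide. The helper name `build_benchmark_cmd` and the camera counts in the loop are illustrative assumptions, not part of the script.

```python
import shlex


def build_benchmark_cmd(task, num_tiled_cameras, cameras_per_env, data_types):
    """Build the argument list for one benchmark run (hypothetical helper)."""
    return [
        "python",
        "scripts/benchmarks/benchmark_cameras.py",
        "--task", task,
        "--num_tiled_cameras", str(num_tiled_cameras),
        "--task_num_cameras_per_env", str(cameras_per_env),
        "--tiled_camera_data_types", *data_types,
    ]


# Illustrative camera counts; choose values that suit your hardware.
for count in (50, 100, 200):
    cmd = build_benchmark_cmd("Isaac-Cartpole-v0", count, 2, ["rgb"])
    # Only print the command here; pass `cmd` to subprocess.run to launch it.
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them before committing to a long run, since each benchmark starts its own simulation.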
If you have pynvml installed (python -m pip install pynvml), you can also
find the maximum number of cameras that you can run in the specified environment up to
a certain resource utilization threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, run:
python scripts/benchmarks/benchmark_cameras.py --task Isaac-Cartpole-v0 --num_tiled_cameras 100 --task_num_cameras_per_env 2 --tiled_camera_data_types rgb --autotune --autotune_max_percentage_util 100 80 50 50
Autotune may lead to the program crashing, which means that it tried to run too many cameras at once. However, the max percentage utilization parameter is meant to prevent this from happening.
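The stopping rule behind autotune can be sketched as follows: keep adding cameras in fixed steps while every measured utilization stays at or below its threshold and the camera cap has not been exceeded. This is a simplified stand-alone sketch, not the script's code. `measure_utilization` is a hypothetical callback standing in for a real benchmark run, and the fake measurements are invented for illustration.

```python
def autotune_camera_count(measure_utilization, start, interval, max_cams, thresholds):
    """Return the largest tested camera count that stayed within all thresholds.

    measure_utilization(num_cams) is a hypothetical callback returning
    [cpu %, ram %, gpu compute %, gpu memory %] for a run with num_cams cameras.
    """
    best = None
    num_cams = start
    while num_cams <= max_cams:
        utilization = measure_utilization(num_cams)
        if any(cur > limit for cur, limit in zip(utilization, thresholds)):
            break  # at least one resource exceeded its threshold
        best = num_cams
        num_cams += interval
    return best


def fake_measure(num_cams):
    # Invented usage model, purely for demonstration.
    return [10, 20, num_cams / 5, num_cams / 4]


print(autotune_camera_count(fake_measure, start=100, interval=50, max_cams=1000,
                            thresholds=[100, 80, 50, 50]))  # prints 200
```

Unlike this sketch, which returns the last count that passed, the script reports the last count it tested, so its result can sit one interval above what stays within the thresholds.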
The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
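As a small worked example of the last step, the conversion from a total camera count to an environment count is a floor division. The function name below is illustrative.

```python
def num_environments(total_cameras, cameras_per_env):
    """Convert a total camera count into an environment count."""
    if cameras_per_env <= 0:
        raise ValueError("cameras_per_env must be positive")
    # Floor division: an environment without its full set of cameras does not count.
    return total_cameras // cameras_per_env


print(num_environments(100, 2))  # prints 50
```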
Compare Camera Type and Performance (Without a Specified Task)#
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, run:
python scripts/benchmarks/benchmark_cameras.py --height 100 --width 100 --num_standard_cameras 2 --standard_camera_data_types instance_segmentation_fast normals --num_objects 100 --experiment_length 100
If your system cannot handle this due to performance reasons, then the process will be killed.
It’s recommended to monitor CPU/RAM utilization and GPU utilization while running this script, to get
an idea of how many resources rendering the desired camera requires. In Ubuntu, you can use tools like htop and nvtop
to live monitor resources while running this script, and in Windows, you can use the Task Manager.
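If you prefer to sample GPU utilization from Python instead of a terminal tool, pynvml can be queried directly. The sketch below is one possible way to do this and is not necessarily how the script collects its statistics. The function name is illustrative, and it returns None when pynvml or an NVIDIA GPU is unavailable.

```python
def gpu_utilization(device_index=0):
    """Return (compute %, memory %) for one GPU, or None if it cannot be queried."""
    try:
        import pynvml
    except ImportError:
        return None  # pynvml is an optional dependency
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None  # no NVIDIA driver or GPU available
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        rates = pynvml.nvmlDeviceGetUtilizationRates(handle)
        memory = pynvml.nvmlDeviceGetMemoryInfo(handle)
        return rates.gpu, 100.0 * memory.used / memory.total
    except pynvml.NVMLError:
        return None
    finally:
        pynvml.nvmlShutdown()


print(gpu_utilization())
```

Calling this periodically from a separate process while the benchmark runs gives a rough picture of GPU load over time.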
If your system has a hard time handling the desired cameras, you can try the following:
Switch to headless mode (supply --headless)
Ensure you are using the GPU pipeline, not the CPU pipeline!
If you aren’t using Tiled Cameras, switch to Tiled Cameras
Decrease camera resolution
Decrease how many data_types there are for each camera.
Decrease the number of cameras
Decrease the number of objects in the scene
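To get a feel for why resolution, data types, and camera count matter, a back-of-the-envelope estimate of the raw output buffer size per step can help. The per-pixel sizes below are assumptions for illustration (8-bit RGB at 3 bytes per pixel, 32-bit depth at 4 bytes per pixel), and the estimate ignores renderer-internal memory, so treat the result as a rough lower bound rather than a measurement.

```python
# Assumed bytes per pixel for each data type (illustrative values only).
BYTES_PER_PIXEL = {"rgb": 3, "depth": 4}


def output_buffer_bytes(num_cameras, height, width, data_types):
    """Estimate the raw size of one step's camera outputs in bytes."""
    per_pixel = sum(BYTES_PER_PIXEL[name] for name in data_types)
    return num_cameras * height * width * per_pixel


# 100 cameras at 100x100 with RGB only: 3,000,000 bytes.
print(output_buffer_bytes(100, 100, 100, ["rgb"]))
# Halving the resolution in both dimensions quarters the estimate.
print(output_buffer_bytes(100, 50, 50, ["rgb"]))
```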
If your system is able to handle the number of cameras, then the time statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.