Find How Many/What Cameras You Should Train With
Isaac Lab currently provides several camera types: USD cameras (standard), tiled cameras,
and ray caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative performance
at different parameters such as camera quantity, image dimensions, and data types.
This utility is provided so that one can easily find the camera type and parameters that are the most performant while meeting the requirements of the user's scenario. It also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. Also,
if you install pynvml, you can let this utility automatically find the maximum
number of cameras that can run in your task environment up to a
specified system resource utilization threshold (without training; taking zero actions
at each timestep).
This guide accompanies the benchmark_cameras.py script in the IsaacLab/source/standalone/tutorials/04_sensors directory.
Code for benchmark_cameras.py
# Copyright (c) 2022-2024, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script might help you determine how many cameras your system can realistically run
at different desired settings. You can supply different task environments
to inject cameras into, or just test a sample scene. Additionally,
you can automatically find the maximum amount of cameras you can run a task with through the
autotune functionality.

.. code-block:: bash

    # Usage with GUI
    ./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py -h

    # Usage with headless
    ./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py -h --headless

"""

"""Launch Isaac Sim Simulator first."""

import argparse
from collections.abc import Callable

from omni.isaac.lab.app import AppLauncher

# parse the arguments
args_cli = argparse.Namespace()

parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")

"""
The following arguments only need to be supplied for when one wishes
to try injecting cameras into their environment, and automatically determining
the maximum camera count.
"""
parser.add_argument(
    "--task",
    type=str,
    default=None,
    required=False,
    help="Supply this argument to spawn cameras within a known manager-based task environment.",
)

parser.add_argument(
    "--autotune",
    default=False,
    action="store_true",
    help=(
        "Autotuning is only supported for provided task environments."
        " Supply this argument to increase the number of environments until a desired threshold is reached."
        " Install pynvml in your environment; ./isaaclab.sh -p -m pip install pynvml"
    ),
)

parser.add_argument(
    "--task_num_cameras_per_env",
    type=int,
    default=1,
    help="The number of cameras per environment to use when using a known task.",
)

parser.add_argument(
    "--use_fabric", action="store_true", default=False, help="Enable fabric and use USD I/O operations."
)

parser.add_argument(
    "--autotune_max_percentage_util",
    nargs="+",
    type=float,
    default=[100.0, 80.0, 80.0, 80.0],
    required=False,
    help=(
        "The system utilization percentage thresholds to reach before an autotune is finished. "
        "If any one of these limits is hit, the autotune stops. "
        "Thresholds are, in order, maximum CPU percentage utilization, "
        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
        "and maximum GPU memory utilization."
    ),
)

parser.add_argument(
    "--autotune_max_camera_count", type=int, default=4096, help="The maximum amount of cameras allowed in an autotune."
)

parser.add_argument(
    "--autotune_camera_count_interval",
    type=int,
    default=25,
    help=(
        "The number of cameras to try to add to the environment if the current camera count"
        " falls within permitted system resource utilization limits."
    ),
)

"""
The following arguments are shared for when injecting cameras into a task environment,
as well as when creating cameras independent of a task environment.
"""

parser.add_argument(
    "--num_tiled_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_standard_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_ray_caster_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--tiled_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "depth"],
    help="The data types rendered by the tiled camera",
)

parser.add_argument(
    "--standard_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
    help="The data types rendered by the standard camera",
)

parser.add_argument(
    "--ray_caster_camera_data_types",
    nargs="+",
    type=str,
    default=["distance_to_image_plane"],
    help="The data types rendered by the ray caster camera.",
)

parser.add_argument(
    "--ray_caster_visible_mesh_prim_paths",
    nargs="+",
    type=str,
    default=["/World/ground"],
    help="WARNING: Ray Caster can currently only cast against a single, static, object",
)

parser.add_argument(
    "--convert_depth_to_camera_to_image_plane",
    action="store_true",
    default=True,
    help=(
        "Enable undistorting from perspective view (distance to camera data_type)"
        " to orthogonal view (distance to plane data_type) for depth."
        " This is currently needed to create undistorted depth images/point cloud."
    ),
)

parser.add_argument(
    "--keep_raw_depth",
    dest="convert_depth_to_camera_to_image_plane",
    action="store_false",
    help=(
        "Disable undistorting from perspective view (distance to camera)"
        " to orthogonal view (distance to plane data_type) for depth."
    ),
)

parser.add_argument(
    "--height",
    type=int,
    default=120,
    required=False,
    help="Height in pixels of cameras",
)

parser.add_argument(
    "--width",
    type=int,
    default=140,
    required=False,
    help="Width in pixels of cameras",
)

parser.add_argument(
    "--warm_start_length",
    type=int,
    default=3,
    required=False,
    help=(
        "Number of steps to run the sim before starting benchmark."
        " Needed to avoid blank images at the start of the simulation."
    ),
)

parser.add_argument(
    "--experiment_length",
    type=int,
    default=15,
    required=False,
    help="Number of steps to average over",
)

# This argument is only used when a task is not provided.
parser.add_argument(
    "--num_objects",
    type=int,
    default=10,
    required=False,
    help="Number of objects to spawn into the scene when not using a known task.",
)


AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
args_cli.enable_cameras = True

if args_cli.autotune:
    import pynvml

if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

"""Rest everything follows."""

import gymnasium as gym
import numpy as np
import random
import time
import torch

import omni.isaac.core.utils.prims as prim_utils
import psutil
from omni.isaac.core.utils.stage import create_new_stage

import omni.isaac.lab.sim as sim_utils
from omni.isaac.lab.assets import RigidObject, RigidObjectCfg
from omni.isaac.lab.scene.interactive_scene import InteractiveScene
from omni.isaac.lab.sensors import (
    Camera,
    CameraCfg,
    RayCasterCamera,
    RayCasterCameraCfg,
    TiledCamera,
    TiledCameraCfg,
    patterns,
)
from omni.isaac.lab.utils.math import convert_perspective_depth_to_orthogonal_depth, unproject_depth

from omni.isaac.lab_tasks.utils import load_cfg_from_registry

"""
Camera Creation
"""


def create_camera_base(
    camera_cfg: type[CameraCfg | TiledCameraCfg],
    num_cams: int,
    data_types: list[str],
    height: int,
    width: int,
    prim_path: str | None = None,
    instantiate: bool = True,
) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
    """Generalized function to create a camera or tiled camera sensor."""
    # Determine prim prefix based on the camera class
    name = camera_cfg.class_type.__name__

    if instantiate:
        # Create the necessary prims
        for idx in range(num_cams):
            prim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
    if prim_path is None:
        prim_path = f"/World/{name}_.*/{name}"
    # If valid camera settings are provided, create the camera
    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cfg = camera_cfg(
            prim_path=prim_path,
            update_period=0,
            height=height,
            width=width,
            data_types=data_types,
            spawn=sim_utils.PinholeCameraCfg(
                focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
            ),
        )
        if instantiate:
            return camera_cfg.class_type(cfg=cfg)
        else:
            return cfg
    else:
        return None


def create_tiled_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> TiledCamera | None:
    """Defines the tiled camera sensor to add to the scene."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=TiledCameraCfg,
        num_cams=num_cams,
        data_types=data_types,
        height=height,
        width=width,
    )


def create_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> Camera | None:
    """Defines the Standard cameras."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
    )


def create_ray_caster_cameras(
    num_cams: int = 2,
    data_types: list[str] = ["distance_to_image_plane"],
    mesh_prim_paths: list[str] = ["/World/ground"],
    height: int = 100,
    width: int = 120,
    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
    instantiate: bool = True,
) -> RayCasterCamera | RayCasterCameraCfg | None:
    """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
    for idx in range(num_cams):
        prim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")

    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cam_cfg = RayCasterCameraCfg(
            prim_path=prim_path,
            mesh_prim_paths=mesh_prim_paths,
            update_period=0,
            offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
            data_types=data_types,
            debug_vis=False,
            pattern_cfg=patterns.PinholeCameraPatternCfg(
                focal_length=24.0,
                horizontal_aperture=20.955,
                height=480,
                width=640,
            ),
        )
        if instantiate:
            return RayCasterCamera(cfg=cam_cfg)
        else:
            return cam_cfg

    else:
        return None


def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
    """Grab a simple tiled camera config for injecting into task environments."""
    return create_camera_base(
        TiledCameraCfg,
        num_cams=args_cli.num_tiled_cameras,
        data_types=args_cli.tiled_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
    """Grab a simple standard camera config for injecting into task environments."""
    return create_camera_base(
        CameraCfg,
        num_cams=args_cli.num_standard_cameras,
        data_types=args_cli.standard_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
    """Grab a simple ray caster config for injecting into task environments."""
    return create_ray_caster_cameras(
        num_cams=args_cli.num_ray_caster_cameras,
        data_types=args_cli.ray_caster_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
    )


"""
Scene Creation
"""


def design_scene(
    num_tiled_cams: int = 2,
    num_standard_cams: int = 0,
    num_ray_caster_cams: int = 0,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    height: int = 100,
    width: int = 200,
    num_objects: int = 20,
    mesh_prim_paths: list[str] = ["/World/ground"],
) -> dict:
    """Design the scene."""
    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Populate scene
    # -- Ground-plane
    cfg = sim_utils.GroundPlaneCfg()
    cfg.func("/World/ground", cfg)
    # -- Lights
    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
    cfg.func("/World/Light", cfg)

    # Create a dictionary for the scene entities
    scene_entities = {}

    # Xform to hold objects
    prim_utils.create_prim("/World/Objects", "Xform")
    # Random objects
    for i in range(num_objects):
        # sample random position
        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
        position *= np.asarray([1.5, 1.5, 0.5])
        # sample random color
        color = (random.random(), random.random(), random.random())
        # choose random prim type
        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
        common_properties = {
            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
            "collision_props": sim_utils.CollisionPropertiesCfg(),
            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
            "semantic_tags": [("class", prim_type)],
        }
        if prim_type == "Cube":
            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
        elif prim_type == "Cone":
            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
        elif prim_type == "Cylinder":
            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
        # Rigid Object
        obj_cfg = RigidObjectCfg(
            prim_path=f"/World/Objects/Obj_{i:02d}",
            spawn=shape_cfg,
            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
        )
        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)

    # Sensors
    standard_camera = create_cameras(
        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
    )
    tiled_camera = create_tiled_cameras(
        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
    )
    ray_caster_camera = create_ray_caster_cameras(
        num_cams=num_ray_caster_cams,
        data_types=ray_caster_camera_data_types,
        mesh_prim_paths=mesh_prim_paths,
        height=height,
        width=width,
    )
    # return the scene information
    if tiled_camera is not None:
        scene_entities["tiled_camera"] = tiled_camera
    if standard_camera is not None:
        scene_entities["standard_camera"] = standard_camera
    if ray_caster_camera is not None:
        scene_entities["ray_caster_camera"] = ray_caster_camera
    return scene_entities


def inject_cameras_into_task(
    task: str,
    num_cams: int,
    camera_name_prefix: str,
    camera_creation_callable: Callable,
    num_cameras_per_env: int = 1,
) -> gym.Env:
    """Loads the task, sticks cameras into the config, and creates the environment."""
    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
    cfg.sim.device = args_cli.device
    cfg.sim.use_fabric = args_cli.use_fabric
    scene_cfg = cfg.scene

    num_envs = int(num_cams / num_cameras_per_env)
    scene_cfg.num_envs = num_envs

    for idx in range(num_cameras_per_env):
        suffix = "" if idx == 0 else str(idx)
        name = camera_name_prefix + suffix
        setattr(scene_cfg, name, camera_creation_callable(name))
    cfg.scene = scene_cfg
    env = gym.make(task, cfg=cfg)
    return env


"""
System diagnosis
"""


def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
    """Get the maximum CPU, RAM, GPU utilization (processing), and
    GPU memory usage percentages since the last time reset was true."""
    if reset:
        max_values[:] = [0, 0, 0, 0]  # Reset the max values

    # CPU utilization
    cpu_usage = psutil.cpu_percent(interval=0.1)
    max_values[0] = max(max_values[0], cpu_usage)

    # RAM utilization
    memory_info = psutil.virtual_memory()
    ram_usage = memory_info.percent
    max_values[1] = max(max_values[1], ram_usage)

    # GPU utilization using pynvml
    if torch.cuda.is_available():

        if args_cli.autotune:
            pynvml.nvmlInit()  # Initialize NVML
            for i in range(torch.cuda.device_count()):
                handle = pynvml.nvmlDeviceGetHandleByIndex(i)

                # GPU Utilization
                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)

                # GPU Memory Usage
                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
                gpu_memory_total = memory_info.total
                gpu_memory_used = memory_info.used
                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)

            pynvml.nvmlShutdown()  # Shutdown NVML after usage
    else:
        gpu_processing_utilization_percent = None
        gpu_memory_utilization_percent = None
    return max_values


"""
Experiment
"""


def run_simulator(
    sim: sim_utils.SimulationContext | None,
    scene_entities: dict | InteractiveScene,
    warm_start_length: int = 10,
    experiment_length: int = 100,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
    convert_depth_to_camera_to_image_plane: bool = True,
    max_cameras_per_env: int = 1,
    env: gym.Env | None = None,
) -> dict:
    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""

    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Initialize camera lists
    tiled_cameras = []
    standard_cameras = []
    ray_caster_cameras = []

    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
    for i in range(max_cameras_per_env):
        # Extract tiled cameras
        tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
        standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
        ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"

        try:  # if instead you checked ... if key is in scene_entities... # errors out always even if key present
            tiled_cameras.append(scene_entities[tiled_camera_key])
            standard_cameras.append(scene_entities[standard_camera_key])
            ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
        except KeyError:
            break

    # Initialize camera counts
    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
    labels = ["tiled", "standard", "ray_caster"]

    if sim is not None:
        # Set camera world poses
        for camera_list in camera_lists:
            for camera in camera_list:
                num_cameras = camera.data.intrinsic_matrices.size(0)
                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
                camera.set_world_poses_from_view(positions, targets)

    # Initialize timing variables
    timestep = 0
    total_time = 0.0
    valid_timesteps = 0
    sim_step_time = 0.0

    while simulation_app.is_running() and timestep < experiment_length:
        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
        get_utilization_percentages()

        # Measure the total simulation step time
        step_start_time = time.time()

        if sim is not None:
            sim.step()

        if env is not None:
            with torch.inference_mode():
                # compute zero actions
                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
                # apply actions
                env.step(actions)

        # Update cameras and process vision data within the simulation step
        clouds = {}
        images = {}
        depth_images = {}

        # Loop through all camera lists and their data_types
        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
            for cam_idx, camera in enumerate(camera_list):

                if env is None:  # No env, need to step cams manually
                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
                    camera.update(dt=sim.get_physics_dt())

                for data_type in data_types:
                    data_label = f"{label}_{cam_idx}_{data_type}"

                    if depth_predicate(data_type):  # is a depth image, want to create cloud
                        depth = camera.data.output[data_type]
                        depth_images[data_label + "_raw"] = depth
                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
                            depth = convert_perspective_depth_to_orthogonal_depth(
                                perspective_depth=camera.data.output[data_type],
                                intrinsics=camera.data.intrinsic_matrices,
                            )
                            depth_images[data_label + "_undistorted"] = depth

                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
                        clouds[data_label] = pointcloud
                    else:  # rgb image, just save it
                        image = camera.data.output[data_type]
                        images[data_label] = image

        # End timing for the step
        step_end_time = time.time()
        sim_step_time += step_end_time - step_start_time

        if timestep > warm_start_length:
            get_utilization_percentages(reset=True)
            total_time += step_end_time - step_start_time
            valid_timesteps += 1

        timestep += 1

    # Calculate average timings
    if valid_timesteps > 0:
        avg_timestep_duration = total_time / valid_timesteps
        avg_sim_step_duration = sim_step_time / experiment_length
    else:
        avg_timestep_duration = 0.0
        avg_sim_step_duration = 0.0

    # Package timing analytics in a dictionary
    timing_analytics = {
        "average_timestep_duration": avg_timestep_duration,
        "average_sim_step_duration": avg_sim_step_duration,
        "total_simulation_time": sim_step_time,
        "total_experiment_duration": sim_step_time,
    }

    system_utilization_analytics = get_utilization_percentages()

    print("--- Benchmark Results ---")
    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
    print(f"Total simulation time: {sim_step_time:.6f} seconds")
    print("\nSystem Utilization Statistics:")
    print(
        f"| CPU:{system_utilization_analytics[0]}% | "
        f"RAM:{system_utilization_analytics[1]}% | "
        f"GPU Compute:{system_utilization_analytics[2]}% | "
        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
    )

    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}


def main():
    """Main function."""
    # Load simulation context
    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
        raise ValueError("You must select at least one camera.")
    if (
        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
    ):
        print("[WARNING]: You have elected to use more than one camera type.")
        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
        print(
            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark,"
            " num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
        )
        raise ValueError("Benchmark one camera at a time.")

    print("[INFO]: Designing the scene")
    if args_cli.task is None:
        print("[INFO]: No task environment provided, creating random scene.")
        sim_cfg = sim_utils.SimulationCfg(device="cpu" if args_cli.cpu else "cuda")
        sim = sim_utils.SimulationContext(sim_cfg)
        # Set main camera
        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
        scene_entities = design_scene(
            num_tiled_cams=args_cli.num_tiled_cameras,
            num_standard_cams=args_cli.num_standard_cameras,
            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            height=args_cli.height,
            width=args_cli.width,
            num_objects=args_cli.num_objects,
            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
        )
        # Play simulator
        sim.reset()
        # Now we are ready!
        print("[INFO]: Setup complete...")
        # Run simulator
        run_simulator(
            sim=sim,
            scene_entities=scene_entities,
            warm_start_length=args_cli.warm_start_length,
            experiment_length=args_cli.experiment_length,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
        )
    else:
        print("[INFO]: Using known task environment, injecting cameras.")
        autotune_iter = 0
        max_sys_util_thresh = [0.0, 0.0, 0.0]
        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
        cur_num_cams = max_num_cams
        cur_sys_util = max_sys_util_thresh
        interval = args_cli.autotune_camera_count_interval

        if args_cli.autotune:
            max_sys_util_thresh = args_cli.autotune_max_percentage_util
            max_num_cams = args_cli.autotune_max_camera_count
            print("[INFO]: Auto tuning until any of the following thresholds are met")
            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
        # Determine which camera is being tested...
        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
        camera_name_prefix = ""
        camera_creation_callable = None
        num_cams = 0
        if tiled_camera_cfg is not None:
            camera_name_prefix = "tiled_camera"
            camera_creation_callable = create_tiled_camera_cfg
            num_cams = args_cli.num_tiled_cameras
        elif standard_camera_cfg is not None:
            camera_name_prefix = "standard_camera"
            camera_creation_callable = create_standard_camera_cfg
            num_cams = args_cli.num_standard_cameras
        elif ray_caster_camera_cfg is not None:
            camera_name_prefix = "ray_caster_camera"
            camera_creation_callable = create_ray_caster_camera_cfg
            num_cams = args_cli.num_ray_caster_cameras

        while (
            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
            and cur_num_cams <= max_num_cams
        ):
            cur_num_cams = num_cams + interval * autotune_iter
            autotune_iter += 1

            env = inject_cameras_into_task(
                task=args_cli.task,
                num_cams=cur_num_cams,
                camera_name_prefix=camera_name_prefix,
                camera_creation_callable=camera_creation_callable,
                num_cameras_per_env=args_cli.task_num_cameras_per_env,
            )
            env.reset()
            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
            analysis = run_simulator(
                sim=None,
                scene_entities=env.unwrapped.scene,
                warm_start_length=args_cli.warm_start_length,
                experiment_length=args_cli.experiment_length,
                tiled_camera_data_types=args_cli.tiled_camera_data_types,
                standard_camera_data_types=args_cli.standard_camera_data_types,
                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
                max_cameras_per_env=args_cli.task_num_cameras_per_env,
                env=env,
            )

            cur_sys_util = analysis["system_utilization_analytics"]
            print("Triggering reset...")
            env.close()
            create_new_stage()
        print("[INFO]: DONE! Feel free to CTRL + C Me ")
        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
        print("Keep in mind, this is without any training running on the GPU.")
        print("Set lower utilization thresholds to account for training.")

        if not args_cli.autotune:
            print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")


if __name__ == "__main__":
    # run the main function
    main()
    # close sim app
    simulation_app.close()
Possible Parameters
First, run
./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py -h
to see all possible parameters you can vary with this utility.
See the command line parameters related to autotune for more information about automatically determining the maximum camera count.
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count
Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.
For example, to see how your system could handle 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total) and only in RGB mode, run:
./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb
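The relationship between the camera count and the environment count in the command above can be sketched as follows. This is a minimal illustration that mirrors the truncating division the script performs when it injects cameras into a task; the helper name envs_for_cameras is hypothetical and not part of the script.

```python
def envs_for_cameras(num_cams: int, cams_per_env: int) -> int:
    """Number of environments created for a total camera count (hypothetical helper)."""
    if cams_per_env <= 0:
        raise ValueError("cams_per_env must be positive")
    # Truncating division, like `int(num_cams / num_cameras_per_env)` in the script.
    return num_cams // cams_per_env


# 100 tiled cameras at 2 cameras per environment
num_envs = envs_for_cameras(100, 2)
print(num_envs)
```

A camera count that is not a multiple of the cameras per environment is rounded down, so it is simplest to pick counts that divide evenly.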
If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also
find the maximum number of cameras that you could run in the specified environment up to
a certain performance threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:
./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50
Autotune may lead to the program crashing, which means that it tried to run too many cameras at once. However, the max percentage utilization parameter is meant to prevent this from happening.
The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
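The search described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the script's implementation: measure is a hypothetical stand-in for one benchmark run that returns peak CPU, RAM, GPU compute, and GPU memory percentages, and fake_measure uses made-up numbers purely so the sketch runs without a simulator. Unlike the script, the sketch returns the last camera count that stayed within every threshold.

```python
from typing import Callable, Sequence


def autotune_camera_count(
    measure: Callable[[int], Sequence[float]],
    thresholds: Sequence[float],
    start: int,
    interval: int,
    max_cameras: int,
) -> int:
    """Return the largest tested camera count that stayed within every threshold."""
    best = 0
    count = start
    while count <= max_cameras:
        usage = measure(count)  # peak [CPU %, RAM %, GPU compute %, GPU memory %]
        if any(u > t for u, t in zip(usage, thresholds)):
            break  # at least one limit was exceeded; stop adding cameras
        best = count
        count += interval
    return best


def fake_measure(n: int) -> list:
    # Made-up linear model for illustration only; real values come from the benchmark.
    return [10.0, 20.0 + 0.05 * n, 0.2 * n, 0.3 * n]


best = autotune_camera_count(fake_measure, [100, 80, 50, 50], start=100, interval=25, max_cameras=4096)
cams_per_env = 2
print(f"{best} cameras -> {best // cams_per_env} environments")
```

The last line applies the division mentioned above: the total camera count divided by the cameras per environment gives the environment count.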
Compare Camera Type and Performance (Without a Specified Task)
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, one could run:
./isaaclab.sh -p source/standalone/tutorials/04_sensors/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100
If your system cannot handle this due to performance reasons, then the process will be killed.
It’s recommended to monitor CPU/RAM utilization and GPU utilization while running this script, to get
an idea of how many resources rendering the desired camera requires. In Ubuntu, you can use tools like htop and nvtop
to live monitor resources while running this script, and in Windows, you can use the Task Manager.
If your system has a hard time handling the desired cameras, you can try the following:
Switch to headless mode (supply --headless)
Ensure you are using the GPU pipeline, not the CPU
If you aren’t using Tiled Cameras, switch to Tiled Cameras
Decrease camera resolution
Decrease how many data_types there are for each camera
Decrease the number of cameras
Decrease the number of objects in the scene
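To get a feel for why resolution, data types, and camera count matter, the rough size of the raw output buffers can be estimated with simple arithmetic. The sketch below is only a back-of-the-envelope lower bound: the bytes-per-pixel values are assumptions (3 bytes for RGB, 4 bytes for a single float32 depth channel), the helper name is hypothetical, and actual renderer memory use will be higher.

```python
# Assumed bytes per pixel for a few data types; actual buffer layouts may differ.
BYTES_PER_PIXEL = {
    "rgb": 3,  # assumes 3 channels of uint8
    "depth": 4,  # assumes 1 channel of float32
    "distance_to_image_plane": 4,
    "distance_to_camera": 4,
}


def estimate_output_bytes(num_cams: int, height: int, width: int, data_types: list) -> int:
    """Rough size of one frame of camera outputs across all cameras (hypothetical helper)."""
    per_pixel = sum(BYTES_PER_PIXEL[d] for d in data_types)
    return num_cams * height * width * per_pixel


full = estimate_output_bytes(100, 120, 140, ["rgb", "depth"])
rgb_only = estimate_output_bytes(100, 120, 140, ["rgb"])
half_res = estimate_output_bytes(100, 60, 70, ["rgb", "depth"])
print(full, rgb_only, half_res)
```

Under these assumptions, halving both image dimensions cuts the estimate to a quarter, while dropping a data type or a camera reduces it linearly.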
If your system is able to handle the number of cameras, then the time statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.
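The time statistics are averages over the steps that follow a short warm-up. A minimal sketch of that idea is shown below; it is not the script's code, the step callable is a hypothetical stand-in for one simulation step, and the exact warm-up boundary used here is a choice of the sketch.

```python
import time
from typing import Callable


def average_step_time(step: Callable[[], None], warm_start_length: int, experiment_length: int) -> float:
    """Average wall-clock duration of `step`, ignoring the first warm-up steps."""
    total, counted = 0.0, 0
    for timestep in range(experiment_length):
        start = time.perf_counter()
        step()
        elapsed = time.perf_counter() - start
        if timestep >= warm_start_length:  # skip warm-up steps
            total += elapsed
            counted += 1
    return total / counted if counted else 0.0


# A short sleep stands in for a simulation step.
avg = average_step_time(lambda: time.sleep(0.001), warm_start_length=3, experiment_length=15)
print(f"Average timestep duration: {avg:.6f} seconds")
```

If the warm-up is at least as long as the experiment, no steps are counted and the sketch returns 0.0 instead of dividing by zero.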