Metrics Design#
Metrics evaluate robot performance and task completion. The system integrates with Isaac Lab’s recorder manager to capture simulation data and compute performance indicators.
Core Architecture#
Metrics use a two-component architecture that separates data collection from metric computation:
from abc import ABC, abstractmethod

import numpy as np

from isaaclab.managers import RecorderTermCfg  # import path may vary by Isaac Lab version


class MetricBase(ABC):
    name: str
    recorder_term_name: str

    @abstractmethod
    def get_recorder_term_cfg(self) -> RecorderTermCfg:
        """Define what data to record."""

    @abstractmethod
    def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float:
        """Compute the final metric from recorded data."""
Each metric pairs a RecorderTerm that collects data during simulation with a MetricBase implementation that processes the recorded data into a performance indicator.
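To make the pairing concrete, the following is a minimal sketch of a recorder and metric pair, assuming Isaac Lab's RecorderTerm interface; the class names, the goal_reached_buf buffer, and the import paths are illustrative assumptions, not the framework's built-in definitions.

import numpy as np

from isaaclab.managers import RecorderTerm, RecorderTermCfg  # paths may vary by version


class GoalReachedRecorder(RecorderTerm):
    """Hypothetical recorder: stores a per-environment success flag after every step."""

    def record_post_step(self):
        # "goal_reached_buf" is an assumed buffer maintained by the task logic.
        return "goal_reached", self._env.goal_reached_buf


class GoalReachedMetric(MetricBase):
    """Hypothetical metric: fraction of episodes whose final success flag is set."""

    name = "goal_reached_rate"
    recorder_term_name = "goal_reached"

    def get_recorder_term_cfg(self) -> RecorderTermCfg:
        return RecorderTermCfg(class_type=GoalReachedRecorder)

    def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float:
        # One array per episode; the final entry is the episode's success flag.
        return float(np.mean([episode[-1] > 0.5 for episode in recorded_metric_data]))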
Metrics in Detail#
- Data Collection Pipeline
A two-phase approach to performance evaluation (see the recording-mode sketch after this list):
RecorderTerm Components: Real-time data collection during simulation with configurable triggers
Recording Modes: Pre-reset, post-step, event-triggered, and continuous monitoring patterns
Storage Format: HDF5 format with episode organization and parallel environment support
Data Extraction: Access simulation state and extract relevant measurements
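The recording modes above map onto hook methods of the recorder term. A hedged sketch, assuming Isaac Lab's record_post_step and record_pre_reset hooks and a scene asset named "microwave":

from isaaclab.managers import RecorderTerm  # import path may vary by version


class DoorStateRecorder(RecorderTerm):
    """Hypothetical recorder demonstrating two recording modes."""

    def record_post_step(self):
        # Continuous monitoring: capture joint positions after every physics step.
        return "door_openness", self._env.scene["microwave"].data.joint_pos

    def record_pre_reset(self, env_ids):
        # Pre-reset trigger: capture the final state just before episodes reset.
        joint_pos = self._env.scene["microwave"].data.joint_pos
        if env_ids is None:
            return "final_door_openness", joint_pos
        return "final_door_openness", joint_pos[env_ids]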
- Available Metrics
Built-in metrics for common evaluation scenarios:
Success Rate Metric: Binary task completion tracking across episodes
Door Moved Rate Metric: Interaction progress with openable objects via joint positions
Object Moved Rate Metric: Manipulation assessment through object velocity tracking
Custom Metrics: Extensible framework for task-specific performance indicators (sketched after this list)
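As an example of the extension point, here is a hedged sketch of a velocity-based custom metric in the style of the built-in object-moved metric, building on MetricBase from above; the recorder class, asset name, speed threshold, and data layout are all assumptions:

import numpy as np

from isaaclab.managers import RecorderTerm, RecorderTermCfg  # paths may vary by version


class ObjectVelocityRecorder(RecorderTerm):
    """Hypothetical recorder: stores the object's root velocity after every step."""

    def record_post_step(self):
        # "pick_object" is an assumed asset name in the scene.
        return "object_velocity", self._env.scene["pick_object"].data.root_vel_w


class VelocityThresholdMetric(MetricBase):
    """Hypothetical custom metric: fraction of episodes where the object moved."""

    name = "object_moved_rate"
    recorder_term_name = "object_velocity"

    def __init__(self, speed_threshold: float = 0.05):
        # Objects exceeding this linear speed (m/s) at any step count as moved.
        self.speed_threshold = speed_threshold

    def get_recorder_term_cfg(self) -> RecorderTermCfg:
        return RecorderTermCfg(class_type=ObjectVelocityRecorder)

    def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float:
        moved = [
            # Assumed layout: (num_steps, 6), linear velocity in the first 3 columns.
            bool(np.any(np.linalg.norm(ep[:, :3], axis=-1) > self.speed_threshold))
            for ep in recorded_metric_data
        ]
        return float(np.mean(moved))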
- Integration Pattern
Metrics integrate through task definitions:
class OpenDoorTask(TaskBase):
    def get_metrics(self) -> list[MetricBase]:
        return [
            SuccessRateMetric(),
            DoorMovedRateMetric(self.openable_object, reset_openness=self.reset_openness),
        ]
- Computation Workflow
A standardized evaluation process (an aggregation sketch follows this list):
Data Recording: Capture relevant simulation data throughout execution
Episode Completion: Organize and store data when episodes terminate
Metric Computation: Post-simulation processing of recorded data
Result Aggregation: Combine multiple metrics into evaluation reports
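A minimal sketch of the last two steps, assuming episode data has already been extracted from the HDF5 recordings into per-term lists of numpy arrays; the recordings mapping and how it is loaded are hypothetical:

import numpy as np


def aggregate_metrics(
    metrics: list[MetricBase],
    recordings: dict[str, list[np.ndarray]],
) -> dict[str, float]:
    """Compute each metric from its recorded data and collect an evaluation report."""
    results: dict[str, float] = {}
    for metric in metrics:
        episode_data = recordings[metric.recorder_term_name]
        results[metric.name] = metric.compute_metric_from_recording(episode_data)
    return results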
Environment Integration#
# Metric collection during environment execution
env_builder = ArenaEnvBuilder(arena_environment, args)
env = env_builder.make_registered()  # Metrics auto-configured from task

# Execute episodes with automatic recording
for episode in range(100):
    obs, _ = env.reset()
    done = False
    while not done:
        actions = policy(obs)
        obs, _, terminated, truncated, _ = env.step(actions)
        # Vectorized envs return per-env tensors; reduce to a single flag.
        done = bool((terminated | truncated).any())

# Compute final performance indicators
metrics_results = compute_metrics(env)
Usage Examples#
Task-Specific Metrics
# Pick and place evaluation
task = PickAndPlaceTask(pick_object, destination, background)
metrics = task.get_metrics() # [SuccessRateMetric(), ObjectMovedRateMetric()]
# Door opening evaluation
task = OpenDoorTask(microwave, openness_threshold=0.8)
metrics = task.get_metrics() # [SuccessRateMetric(), DoorMovedRateMetric()]
Results Analysis
# Performance evaluation across environments
print(f"Success Rate: {metrics_results['success_rate']:.2%}")
print(f"Object Moved Rate: {metrics_results['object_moved_rate']:.2%}")