NSF PAR Search | NSF Public Access Repository


Task-Relevant Evaluation Metrics of Object Detection for Quantitative System-Level Analysis of Safety-Critical Autonomous Systems

https://doi.org/10.1145/3771284

Badithela, Apurva; Srivastav, Ranai; Wongpiromsarn, Tichakorn; Murray, Richard (October 2025, ACM Transactions on Cyber-Physical Systems)

In safety-critical robotic systems, perception is tasked with representing the environment to effectively guide decision-making and plays a crucial role in ensuring that the overall system meets its requirements. To quantitatively assess the impact of object detection and classification errors on system-level performance, we present a rigorous formalism for a model of detection error, and probabilistically reason about the satisfaction of regular-safety temporal logic requirements at the system level. We also show how standard evaluation metrics for object detection, such as confusion matrices, can be represented as models of detection error, which enables the computation of probabilistic satisfaction of system-level specifications. However, traditional confusion matrices treat all detections equally, without considering their relevance to the system-level task. To address this limitation, we propose novel evaluation metrics for object detection that are informed by both the system-level task and the downstream control logic, enabling a more context-appropriate evaluation of detection models. We identify logic-based formulas relevant to the downstream control and system-level specifications and use these formulas to define a logic-based evaluation metric for object detection and classification. These logic-based metrics result in less conservative assessments of system-level performance. Finally, we demonstrate our approach on a car-pedestrian example with a leaderboard PointPillars model evaluated on the nuScenes dataset, and validate probabilistic system-level guarantees in simulation.
Free, publicly-accessible full text available October 15, 2026
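
The abstract above notes that a confusion matrix can be read as a model of detection error. As a rough illustration only (not the authors' formalism), the sketch below row-normalizes a matrix of counts into estimated conditional probabilities P(predicted class | true class). The class names, the counts, and the helper `prob_detected_at_least_once` are hypothetical, and the independence of detections across time steps is an assumption made purely for this example.

```python
import numpy as np

def detection_error_model(counts):
    """Row-normalize a confusion matrix of counts.

    counts[i, j] = number of samples whose true class is i and whose
    predicted class is j. Returns P with P[i, j] estimating
    P(predicted = j | true = i).
    """
    counts = np.asarray(counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every true class needs at least one sample")
    return counts / row_sums

def prob_detected_at_least_once(P, true_class, steps):
    """Probability that true_class is correctly detected in at least one
    of `steps` observations, ASSUMING independent errors across steps."""
    miss = 1.0 - P[true_class, true_class]
    return 1.0 - miss ** steps

# Hypothetical counts; rows/columns: pedestrian, obstacle, empty.
counts = [[80, 10, 10],
          [5, 90, 5],
          [2, 3, 95]]
P = detection_error_model(counts)
p_seen = prob_detected_at_least_once(P, true_class=0, steps=2)
```

Each row of `P` sums to one, so a row can serve as the distribution over what the detector reports for a given true class. A system-level analysis would feed such probabilities into a model of the controller and environment rather than use the closed-form shortcut shown here.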
Evaluation Metrics for Object Detection for Autonomous Systems

Badithela, Apurva; Wongpiromsarn, Tichakorn; Murray, Richard M. (October 2023, 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS))

This paper studies the evaluation of learning-based object detection models in conjunction with model-checking of formal specifications defined on an abstract model of an autonomous system and its environment. In particular, we define two metrics – proposition-labeled and class-labeled confusion matrices – for evaluating object detection, and we incorporate these metrics to compute the satisfaction probability of system-level safety requirements. While confusion matrices have been effective for comparative evaluation of classification and object detection models, our framework fills two key gaps. First, we relate the performance of object detection to formal requirements defined over downstream high-level planning tasks. In particular, we provide empirical results showing that the choice of a good object detection algorithm, with respect to formal requirements on the overall system, depends significantly on the downstream planning and control design. Second, unlike the traditional confusion matrix, our metrics account for variations in performance with respect to the distance between the ego vehicle and the object being detected. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic (LTL).
Full Text Available
Synthesizing Reactive Test Environments for Autonomous Systems: Testing Reach-Avoid Specifications with Multi-Commodity Flows

https://doi.org/10.1109/ICRA48891.2023.10160841

Badithela, Apurva; Graebener, Josefine B.; Ubellacker, Wyatt; Mazumdar, Eric V.; Ames, Aaron D.; Murray, Richard M. (May 2023, IEEE)

Full Text Available
