State-of-the-art lane detection methods use a variety of deep learning techniques for lane feature extraction and prediction, demonstrating better performance than conventional lane detectors. However, deep learning approaches are computationally demanding and often fail to meet the real-time requirements of autonomous vehicles. This paper proposes a lane detection method that uses a lightweight convolutional neural network model as a feature extractor, exploiting the potential of deep learning while meeting real-time needs. The developed model is trained with a dataset containing small image patches of dimension 16 × 64 pixels, and a non-overlapping sliding window approach is employed to achieve fast inference. The predictions are then clustered and fitted with a polynomial to model the lane boundaries. The proposed method was tested on the KITTI and Caltech datasets and demonstrated acceptable performance. We also integrated the detector into the localization and planning system of our autonomous vehicle, where it runs at 28 fps on a CPU at an image resolution of 768 × 1024, meeting the real-time requirements of self-driving cars.
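The patch-based pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `split_into_patches` and `fit_lane` are hypothetical, the 16 × 64 patch is assumed to be 16 rows by 64 columns, the 768 × 1024 image is assumed to be 768 rows by 1024 columns, and the CNN classifier and the clustering step are omitted.

```python
import numpy as np

PATCH_H, PATCH_W = 16, 64  # patch size from the abstract; the row/column order is an assumption


def split_into_patches(image, patch_h=PATCH_H, patch_w=PATCH_W):
    """Tile an image into non-overlapping patches.

    Returns the stacked patches and the (row, col) centre of each patch.
    """
    h, w = image.shape[:2]
    patches, centers = [], []
    for top in range(0, h - patch_h + 1, patch_h):
        for left in range(0, w - patch_w + 1, patch_w):
            patches.append(image[top:top + patch_h, left:left + patch_w])
            centers.append((top + patch_h / 2.0, left + patch_w / 2.0))
    return np.stack(patches), np.array(centers)


def fit_lane(lane_centers, degree=2):
    """Fit col = f(row) through the centres of patches classified as lane.

    In a full pipeline the centres would first be clustered per lane boundary;
    here a single boundary is assumed.
    """
    rows, cols = lane_centers[:, 0], lane_centers[:, 1]
    return np.polyfit(rows, cols, degree)


if __name__ == "__main__":
    frame = np.zeros((768, 1024), dtype=np.uint8)  # placeholder frame
    patches, centers = split_into_patches(frame)
    print(patches.shape)  # one classifier call per patch would follow here
```

Because the windows do not overlap, the number of classifier evaluations per frame is fixed by the image and patch sizes, which is what keeps the per-frame cost low.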
Towards Driving-Oriented Metric for Lane Detection Models
After the 2017 TuSimple Lane Detection Challenge, its dataset and evaluation based on accuracy and F1 score have become the de facto standard to measure the performance of lane detection methods. While they have played a major role in improving the performance of lane detection methods, the validity of this evaluation method in downstream tasks has not been adequately researched. In this study, we design 2 new driving-oriented metrics for lane detection: End-to-End Lateral Deviation metric (E2E-LD) is directly formulated based on the requirements of autonomous driving, a core downstream task of lane detection; Per-frame Simulated Lateral Deviation metric (PSLD) is a lightweight surrogate metric of E2E-LD. To evaluate the validity of the metrics, we conduct a large-scale empirical study with 4 major types of lane detection approaches on the TuSimple dataset and our newly constructed dataset Comma2k19-LD. Our results show that the conventional metrics have strongly negative correlations (≤-0.55) with E2E-LD, meaning that some recent improvements purely targeting the conventional metrics may not have led to meaningful improvements in autonomous driving, but rather may actually have made it worse by overfitting to the conventional metrics. As autonomous driving is a security/safety-critical system, the underestimation of robustness hinders the sound development of practical lane detection models. We hope that our study will help the community achieve more downstream task-aware evaluations for lane detection.
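As a rough illustration of the kind of analysis described above, the sketch below computes an F1 score from detection counts and a Pearson correlation between two per-model score lists. The function names are hypothetical and the numbers in the usage section are made-up placeholders, not results from the study. The E2E-LD and PSLD metrics themselves are not reproduced, because the abstract does not define them.

```python
import numpy as np


def f1_score(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    return float(np.corrcoef(x, y)[0, 1])


if __name__ == "__main__":
    # Placeholder values for illustration only (one entry per hypothetical model).
    conventional_scores = [0.90, 0.93, 0.95, 0.96]
    driving_scores = [0.40, 0.35, 0.50, 0.45]
    print(f1_score(tp=80, fp=20, fn=20))
    print(pearson_r(conventional_scores, driving_scores))
```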
- PAR ID: 10359468
- Date Published:
- Journal Name: IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
-
In safety-critical robotic systems, perception is tasked with representing the environment to effectively guide decision-making and plays a crucial role in ensuring that the overall system meets its requirements. To quantitatively assess the impact of object detection and classification errors on system-level performance, we present a rigorous formalism for a model of detection error, and probabilistically reason about the satisfaction of regular-safety temporal logic requirements at the system level. We also show how standard evaluation metrics for object detection, such as confusion matrices, can be represented as models of detection error, which enables the computation of probabilistic satisfaction of system-level specifications. However, traditional confusion matrices treat all detections equally, without considering their relevance to the system-level task. To address this limitation, we propose novel evaluation metrics for object detection that are informed by both the system-level task and the downstream control logic, enabling a more context-appropriate evaluation of detection models. We identify logic-based formulas relevant to the downstream control and system-level specifications and use these formulas to define a logic-based evaluation metric for object detection and classification. These logic-based metrics result in less conservative assessments of system-level performance. Finally, we demonstrate our approach on a car-pedestrian example with a leaderboard PointPillars model evaluated on the nuScenes dataset, and validate probabilistic system-level guarantees in simulation.
-
RNN Transducer (RNN-T) technology is very popular for building deployable models for end-to-end (E2E) automatic speech recognition (ASR) and spoken language understanding (SLU). Since these are E2E models operating on speech directly, there remains a potential to improve their performance using purely text-based models like BERT, which have strong language understanding capabilities. In this paper, we propose a new training criterion for RNN-T based E2E ASR and SLU to transfer BERT’s knowledge into these systems. In the first stage of our proposed mechanism, we improve ASR performance by using a fine-grained, tokenwise knowledge transfer from BERT. In the second stage, we fine-tune the ASR model for SLU such that the above knowledge is explicitly utilized by the RNN-T model for improved performance. Our techniques improve ASR performance on the Switchboard and CallHome test sets of the NIST Hub5 2000 evaluation and on the recently released SLURP dataset, on which we achieve a new state-of-the-art performance. For SLU, we show significant improvements on the SLURP slot filling task, outperforming HuBERT-base and reaching a performance close to HuBERT-large. Compared to large transformer-based speech models like HuBERT, our model is significantly more compact and uses only 300 hours of speech pretraining data.
-
The performance of object detection models in adverse weather conditions remains a critical challenge for intelligent transportation systems. Since advancements in autonomous driving rely heavily on extensive datasets, which help autonomous driving systems be reliable in complex driving environments, this study provides a comprehensive dataset under diverse weather scenarios like rain, haze, nighttime, or sun flares and systematically evaluates the robustness of state-of-the-art deep learning-based object detection frameworks. Our Adverse Driving Conditions Dataset features eight single weather effects and four challenging mixed weather effects, with a curated collection of 50,000 traffic images for each weather effect. State-of-the-art object detection models are evaluated using standard metrics, including precision, recall, and IoU. Our findings reveal significant performance degradation under adverse conditions compared to clear weather, highlighting common issues such as misclassification and false positives. For example, scenarios like haze combined with rain cause frequent detection failures, highlighting the limitations of current algorithms. Through comprehensive performance analysis, we provide critical insights into model vulnerabilities and propose directions for developing weather-resilient object detection systems. This work contributes to advancing robust computer vision technologies for safer and more reliable transportation in unpredictable real-world environments.
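For reference, the IoU metric mentioned in this abstract can be computed for two axis-aligned boxes as in the sketch below. The `(x1, y1, x2, y2)` box format and the function name `box_iou` are assumptions for illustration; the abstract does not state which box format or IoU threshold the study uses.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a chosen threshold, and precision and recall are then derived from those counts.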
-
This paper studies the evaluation of learning-based object detection models in conjunction with model-checking of formal specifications defined on an abstract model of an autonomous system and its environment. In particular, we define two metrics – proposition-labeled and class-labeled confusion matrices – for evaluating object detection, and we incorporate these metrics to compute the satisfaction probability of system-level safety requirements. While confusion matrices have been effective for comparative evaluation of classification and object detection models, our framework fills two key gaps. First, we relate the performance of object detection to formal requirements defined over downstream high-level planning tasks. In particular, we provide empirical results that show that the choice of a good object detection algorithm, with respect to formal requirements on the overall system, significantly depends on the downstream planning and control design. Secondly, unlike the traditional confusion matrix, our metrics account for variations in performance with respect to the distance between the ego and the object being detected. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic (LTL).
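One simple way to make a confusion matrix sensitive to the distance between the ego vehicle and the detected object is to keep a separate matrix per distance bin, as sketched below. This is an assumed illustration only: the function name `binned_confusion`, the bin edges, and the row/column convention are hypothetical, and the paper's proposition-labeled and class-labeled definitions are not given in the abstract.

```python
import numpy as np


def binned_confusion(true_labels, pred_labels, distances, n_classes, bin_edges):
    """Build one confusion matrix per distance bin.

    Returns an array of shape (n_bins, n_classes, n_classes), where
    entry [b, t, p] counts objects of true class t predicted as class p
    whose distance falls in bin b. Samples outside all bins are ignored.
    """
    n_bins = len(bin_edges) - 1
    cm = np.zeros((n_bins, n_classes, n_classes), dtype=int)
    # np.digitize returns 1-based bin indices for in-range values.
    bins = np.digitize(distances, bin_edges) - 1
    for t, p, b in zip(true_labels, pred_labels, bins):
        if 0 <= b < n_bins:
            cm[b, t, p] += 1
    return cm


if __name__ == "__main__":
    # Placeholder labels and distances for illustration only.
    cm = binned_confusion(
        true_labels=[0, 1, 1],
        pred_labels=[0, 1, 0],
        distances=[5.0, 5.0, 25.0],
        n_classes=2,
        bin_edges=[0.0, 10.0, 30.0],
    )
    print(cm)
```

Normalizing each per-bin matrix by its row sums would give distance-dependent estimates of the detection probabilities.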

