NSF PAR Search | NSF Public Access Repository

Passing the Driving Knowledge Test

Wei, M; Liu, W; Ohn-Bar, E (October 2025, ICCV)

Free, publicly-accessible full text available October 20, 2026
Scalable Offline Metrics for Autonomous Driving

Aich, A; Kulkarni, A; Ohn-Bar, E (October 2025, IROS)

Free, publicly-accessible full text available October 20, 2026
ZeroVO: Visual Odometry with Minimal Assumptions

Lai, L; Yin, Z; Ohn-Bar, E (June 2025, CVPR)

Free, publicly-accessible full text available June 20, 2026
Text to Blind Motion

Kim, H J; Sengupta, K; Kuribayashi, M; Kacorri, H; Ohn-Bar, E (December 2024, NeurIPS)

Free, publicly-accessible full text available December 20, 2025
Motion Diversification Networks

Kim, H; Ohn-Bar, E (June 2024, Computer Vision and Pattern Recognition)

Full Text Available
Uncertainty-Guided Never-Ending Learning to Drive

Lai, L; Ohn-Bar, E; Arora, S; Yi, J (June 2024, Conference on Computer Vision and Pattern Recognition)

Full Text Available
Feedback-Guided Autonomous Driving

Zhang, J; Huang, Z; Ray, A; Ohn-Bar, E (June 2024, Computer Vision and Pattern Recognition)

Full Text Available
Learning to Drive Anywhere

Zhu, Ruizhao; Huang, Peng; Ohn-Bar, Eshed; Saligrama, Venkatesh (November 2023, Conference on Robot Learning)

Human drivers can seamlessly adapt their driving decisions across geographical locations with diverse conditions and rules of the road, e.g., left vs. right-hand traffic. In contrast, existing models for autonomous driving have been thus far only deployed within restricted operational domains, i.e., without accounting for varying driving behaviors across locations or model scalability. In this work, we propose AnyD, a single geographically-aware conditional imitation learning (CIL) model that can efficiently learn from heterogeneous and globally distributed data with dynamic environmental, traffic, and social characteristics. Our key insight is to introduce a high-capacity geo-location-based channel attention mechanism that effectively adapts to local nuances while also flexibly modeling similarities among regions in a data-driven manner. By optimizing a contrastive imitation objective, our proposed approach can efficiently scale across the inherently imbalanced data distributions and location-dependent events. We demonstrate the benefits of our AnyD agent across multiple datasets, cities, and scalable deployment paradigms, i.e., centralized, semi-supervised, and distributed agent training. Specifically, AnyD outperforms CIL baselines by over 14% in open-loop evaluation and 30% in closed-loop testing on CARLA.
Full Text Available
Generalized Visual Odometry via Cross-Modal Self-Training

Lai, Lei; Shangguan, Zhongkai; Zhang, Jimuyang; Ohn-Bar, Eshed (October 2023, International Conference on Computer Vision)

We propose XVO, a semi-supervised learning method for training generalized monocular Visual Odometry (VO) models with robust off-the-shelf operation across diverse datasets and settings. In contrast to standard monocular VO approaches, which often study a known calibration within a single dataset, XVO efficiently learns to recover relative pose with real-world scale from visual scene semantics, i.e., without relying on any known camera parameters. We optimize the motion estimation model via self-training from large amounts of unconstrained and heterogeneous dash camera videos available on YouTube. Our key contribution is twofold. First, we empirically demonstrate the benefits of semi-supervised training for learning a general-purpose direct VO regression network. Second, we demonstrate multi-modal supervision, including segmentation, flow, depth, and audio auxiliary prediction tasks, to facilitate generalized representations for the VO task. Specifically, we find that the audio prediction task significantly enhances the semi-supervised learning process while alleviating noisy pseudo-labels, particularly in highly dynamic and out-of-domain video data. Our proposed teacher network achieves state-of-the-art performance on the commonly used KITTI benchmark despite using no multi-frame optimization or knowledge of camera parameters. Combined with the proposed semi-supervised step, XVO demonstrates off-the-shelf knowledge transfer across diverse conditions on KITTI, nuScenes, and Argoverse without fine-tuning.
Full Text Available
ASSISTER: Assistive Navigation via Conditional Instruction Generation

https://doi.org/10.1007/978-3-031-20059-5_16

Huang, Z.; Shangguan, Z.; Zhang, J.; Bar, G.; Boyd, M.; Ohn-Bar, E. (October 2022, European Conference on Computer Vision)

We introduce a novel vision-and-language navigation (VLN) task of learning to provide real-time guidance to a blind follower situated in complex dynamic navigation scenarios. Towards exploring real-time information needs and fundamental challenges in our novel modeling task, we first collect a multi-modal real-world benchmark with in-situ Orientation and Mobility (O&M) instructional guidance. Subsequently, we leverage the real-world study to inform the design of a larger-scale simulation benchmark, thus enabling comprehensive analysis of limitations in current VLN models. Motivated by how sighted O&M guides seamlessly and safely support the awareness of individuals with visual impairments when collaborating on navigation tasks, we present ASSISTER, an imitation-learned agent that can embody such effective guidance. The proposed assistive VLN agent is conditioned on navigational goals and commands for generating instructional sentences that are coherent with the surrounding visual scene, while also carefully accounting for the immediate assistive navigation task. Altogether, our introduced evaluation and training framework takes a step towards scalable development of the next generation of seamless, human-like assistive agents.
Full Text Available