Title: FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis
Balancing performance and safety is crucial to deploying autonomous vehicles in multi-agent environments. In particular, autonomous racing is a domain that penalizes safe but conservative policies, highlighting the need for robust, adaptive strategies. Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation. This work makes algorithmic contributions to both challenges. First, to generate a realistic, diverse set of opponents, we develop a novel method for self-play based on replica-exchange Markov chain Monte Carlo. Second, we propose a distributionally robust bandit optimization procedure that adaptively adjusts risk aversion relative to uncertainty in beliefs about opponents’ behaviors. We rigorously quantify the tradeoffs in performance and robustness when approximating these computations in real-time motion-planning, and we demonstrate our methods experimentally on autonomous vehicles that achieve scaled speeds comparable to Formula One racecars.
Award ID(s):
1925587
NSF-PAR ID:
10221875
Journal Name:
Proceedings of the 37th International Conference on Machine Learning
Page Range / eLocation ID:
8992--9004
Sponsoring Org:
National Science Foundation
More Like this
  1. Fleets of networked autonomous vehicles (AVs) collect terabytes of sensory data, which is often transmitted to central servers (the “cloud”) for training machine learning (ML) models. Ideally, these fleets should upload all their data, especially from rare operating contexts, in order to train robust ML models. However, this is infeasible due to prohibitive network bandwidth and data labeling costs. Instead, we propose a cooperative data sampling strategy where geo-distributed AVs collaborate to collect a diverse ML training dataset in the cloud. Since the AVs have a shared objective but minimal information about each other’s local data distribution and perception model, we can naturally cast cooperative data collection as an 𝑁-player mathematical game. We show that our cooperative sampling strategy uses minimal information to converge to a centralized oracle policy with complete information about all AVs. Moreover, we theoretically characterize the performance benefits of our game-theoretic strategy compared to greedy sampling. Finally, we experimentally demonstrate that our method outperforms standard benchmarks by up to 21.9% on 4 perception datasets, including for autonomous driving in adverse weather conditions. Crucially, our experimental results on real-world datasets closely align with our theoretical guarantees. 
  2. Vehicle‐to‐Everything (V2X) communication has been proposed as a potential solution to improve the robustness and safety of autonomous vehicles by improving coordination and removing the barrier of non‐line‐of‐sight sensing. Cooperative Vehicle Safety (CVS) applications are tightly dependent on the reliability of the underlying data system, which can suffer from loss of information due to the inherent issues of its different components, such as sensor failures or the poor performance of V2X technologies under dense communication channel load. In particular, information loss affects the target classification module and, subsequently, the safety application performance. To enable reliable and robust CVS systems that mitigate the effect of information loss, a Context‐Aware Target Classification (CA‐TC) module coupled with a hybrid learning‐based predictive modeling technique for CVS systems is proposed. The CA‐TC consists of two modules: a Context‐Aware Map (CAM) and a Hybrid Gaussian Process (HGP) prediction system. Consequently, the vehicle safety applications use the information from the CA‐TC, making them more robust and reliable. The CAM leverages vehicles' path history, road geometry, tracking, and prediction; the HGP provides accurate vehicle trajectory predictions to compensate for data loss (due to communication congestion) or sensor measurement inaccuracies. Based on offline real‐world data, a finite bank of driver models that represent the joint dynamics of the vehicle and the drivers' behavior is learned. Offline training and online model updates are combined with on‐the‐fly forecasting to account for new possible driver behaviors. Finally, the framework is validated using simulation and realistic driving scenarios to confirm its potential in enhancing the robustness and reliability of CVS systems.
  3. Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to make an unlimited number of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.
  4. A resilient positioning, navigation, and timing (PNT) system is a necessity for the robust navigation of autonomous vehicles (AVs). A global navigation satellite system (GNSS) provides satellite-based PNT services. However, a spoofer can tamper with the authentic GNSS signal and transmit incorrect position information to an AV. Therefore, an AV must be capable of detecting spoofing attacks on its PNT receivers in real time, which will help the end user (the AV in this case) navigate safely even if the GNSS is compromised. This paper aims to develop a deep reinforcement learning (RL)-based turn-by-turn spoofing attack detection method using low-cost in-vehicle sensor data. We have utilized the Honda Research Institute Driving Dataset to create attack and non-attack datasets to develop a deep RL model and have evaluated the performance of the deep RL-based attack detection model. We find that the accuracy of the deep RL model ranges from 99.99% to 100%, and the recall value is 100%. Furthermore, the precision ranges from 93.44% to 100%, and the F1 score ranges from 96.61% to 100%. Overall, the analyses reveal that the RL model is effective in turn-by-turn spoofing attack detection.
  5. The paper discusses an intelligent vision-based control solution for autonomous tracking and landing of Vertical Take-Off and Landing (VTOL) capable Unmanned Aerial Vehicles (UAVs) on ships without using a GPS signal. The central idea involves automating the Navy helicopter ship landing procedure, in which the pilot uses the ship as the visual reference for long-range tracking but relies on a standardized visual cue installed on most Navy ships, called the “horizon bar”, for the final approach and landing phases. This idea is implemented using a uniquely designed nonlinear controller integrated with machine vision. The vision system uses machine learning based object detection for long-range ship tracking and classical computer vision to estimate the aircraft's relative position and orientation from the horizon bar during the final approach and landing phases. The nonlinear controller operates based on the information estimated by the vision system and has demonstrated robust tracking performance even in the presence of uncertainties. The developed autonomous ship landing system was implemented on a quad-rotor UAV equipped with an onboard camera, and approach and landing were successfully demonstrated on a moving deck, which imitates realistic ship deck motions. Extensive simulations and flight tests were conducted to demonstrate vertical landing safety, tracking capability, and landing accuracy. The video of the real-world experiments and demonstrations is available at this URL.