Search for: All records

Award ID contains: 1849246

  1. With the development of sensing and communication technologies in networked cyber-physical systems (CPSs), multi-agent reinforcement learning (MARL)-based methodologies are integrated into the control process of physical systems and demonstrate prominent performance in a wide array of CPS domains, such as connected autonomous vehicles (CAVs). However, it remains challenging to mathematically characterize the performance improvement of CAVs with communication and cooperation capability. When each individual autonomous vehicle is originally self-interested, we cannot assume that all agents will cooperate naturally during the training process. In this work, we propose to reallocate the system’s total reward efficiently to motivate stable cooperation among autonomous vehicles. We formally define and quantify how to reallocate the system’s total reward to each agent under the proposed transferable utility game, such that communication-based cooperation among agents increases the system’s total reward. We prove that the Shapley value-based reward reallocation of MARL lies in the core if the transferable utility game is convex. Hence, the cooperation is stable and efficient, and the agents should stay in the coalition, that is, the cooperating group. We then propose a cooperative policy learning algorithm with Shapley value reward reallocation. In experiments, we show that our proposed algorithm improves the mean episode system reward of CAV systems compared with several algorithms from the literature.
  2. null (Ed.)
  3. null (Ed.)
  4. null (Ed.)
  5. As communication technologies develop, an autonomous vehicle will receive information not only from its own sensing system but also from infrastructure and other vehicles through communication. This paper discusses how to exploit a sequence of future information shared among autonomous vehicles, including their planned positions, velocities, and lane numbers. A hybrid system model is constructed, and a control policy is designed to use the shared sequence information for making navigation decisions. For the high-level discrete state transitions, the shared information is used to determine when to change lanes, provided that the lane change yields a reward for the autonomous vehicle and a feasible continuous-state controller exists. For the low-level continuous-state controller generation, the shared information can relax the safety interval constraints in the existing model predictive control method. At the system level, information sharing can increase traffic flow and improve driving comfort. We demonstrate the advantages of information sharing in control and navigation in simulation.
  6. As it becomes possible for e-hailing services to operate Autonomous Vehicles (AVs), especially when telecom companies start deploying next-generation wireless networks (known as 5G), many new technologies may be applied in these vehicles. Dynamic route switching is one of these technologies; it could help vehicles find the best possible route based on real-time traffic information. However, allowing all AVs to choose their own optimal routes is not the best solution for a complex city network, since each vehicle ignores its negative effect on the road system due to the additional congestion it creates. As a result, some of the links may become over-congested, degrading the performance of the whole road network. Meanwhile, travel time reliability, especially during peak hours, is an essential factor in improving the customers' ride experience. Unfortunately, these two issues have received relatively little attention. In this paper, we design a link-based dynamic pricing model to improve road network performance and travel time reliability at the same time. In this approach, we assume that all links are eligible for dynamic pricing, and that AVs are perfectly informed of up-to-date traffic conditions and follow the dynamic road pricing. A heuristic approach is developed to address this computationally difficult problem. The output includes the link-based surcharge, the new travel demand, and the traffic conditions that would bring system performance close to the System Optimal (SO) solution while maintaining travel time reliability. Finally, we evaluate the effectiveness and efficiency of the proposed model on the well-known Sioux Falls test network.
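
The first abstract above states that a Shapley value-based reallocation lies in the core when the transferable utility game is convex. The sketch below illustrates that property on a toy game only; it is not the authors' learning algorithm. The agent names (`cav_1`, `cav_2`, `cav_3`), the characteristic function `toy_value`, and all helper functions are hypothetical and chosen purely for illustration.

```python
from itertools import combinations, permutations
from math import factorial


def all_coalitions(players):
    """Yield every subset of the players, including the empty set."""
    for size in range(len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def shapley_values(players, v):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += v(joined) - v(coalition)
            coalition = joined
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def is_convex(players, v, tol=1e-9):
    """Convexity (supermodularity): v(S | T) + v(S & T) >= v(S) + v(T)."""
    subsets = list(all_coalitions(players))
    return all(
        v(s | t) + v(s & t) >= v(s) + v(t) - tol
        for s in subsets
        for t in subsets
    )


def in_core(players, v, allocation, tol=1e-9):
    """Core membership: the allocation is efficient and no coalition gains by leaving."""
    grand = frozenset(players)
    if abs(sum(allocation.values()) - v(grand)) > tol:
        return False
    return all(
        sum(allocation[p] for p in c) >= v(c) - tol
        for c in all_coalitions(players)
    )


def toy_value(coalition):
    """Hypothetical total reward of a coalition (assumption): |S| squared."""
    return float(len(coalition) ** 2)


players = ["cav_1", "cav_2", "cav_3"]  # hypothetical agent names
phi = shapley_values(players, toy_value)
print(phi)
print("convex:", is_convex(players, toy_value))
print("Shapley allocation in core:", in_core(players, toy_value, phi))
```

Enumerating all orderings costs time factorial in the number of agents, so this exact computation is only practical for very small coalitions.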