NSF PAR Search | NSF Public Access Repository

Value-Decomposition Multi-Agent Actor-Critics

Su, Jianyu; Adams, Stephen; Beling, Peter (February 2021, Proceedings of the AAAI Conference on Artificial Intelligence)

The exploitation of extra state information has been an active research area in multi-agent reinforcement learning (MARL). QMIX represents the joint action-value using a non-negative function approximator and achieves the best performance on the StarCraft II micromanagement testbed, a common MARL benchmark. However, our experiments demonstrate that, in some cases, QMIX performs sub-optimally with the A2C framework, a training paradigm that promotes algorithm training efficiency. To obtain a reasonable trade-off between training efficiency and algorithm performance, we extend value-decomposition to actor-critic methods that are compatible with A2C and propose a novel actor-critic framework, value-decomposition actor-critic (VDAC). We evaluate VDAC on the StarCraft II micromanagement task and demonstrate that the proposed framework improves median performance over other actor-critic methods. Furthermore, we use a set of ablation experiments to identify the key factors that contribute to the performance of VDAC.
Full Text Available
Generative adversarial networks for data augmentation and transfer in credit card fraud detection

https://doi.org/10.1080/01605682.2021.1880296

Langevin, Alex; Cody, Tyler; Adams, Stephen; Beling, Peter (January 2021, Journal of the Operational Research Society)
Full Text Available
Graph Convolution Networks for Probabilistic Modeling of Driving Acceleration

https://doi.org/10.1109/ITSC45102.2020.9294294

Su, Jianyu; Beling, Peter; Guo, Rui; Han, Kyungtae (December 2020, 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC))

The ability to model and predict the traffic surrounding the ego vehicle is crucial for autonomous pilots and intelligent driver-assistance systems. Acceleration prediction is a major component of traffic prediction. This paper proposes novel approaches to the acceleration prediction problem. By representing spatial relationships between vehicles with a graph model, we build a generalized acceleration prediction framework. We study the effectiveness of the proposed Graph Convolution Networks, which operate on graphs to predict the acceleration distribution for vehicles driving on highways. We further investigate whether integrating Recurrent Neural Networks improves prediction by disentangling the temporal complexity inherent in the traffic data. Simulation results across comprehensive performance metrics indicate that the proposed networks outperform state-of-the-art methods in generating realistic trajectories over a prediction horizon.
Full Text Available
Modeling law search as prediction

https://doi.org/10.1007/s10506-020-09261-5

Dadgostari, Faraz; Guim, Mauricio; Beling, Peter A.; Livermore, Michael A.; Rockmore, Daniel N. (January 2020, Artificial Intelligence and Law)

Full Text Available
A Spatially and Temporally Attentive Joint Trajectory Prediction Framework for Modeling Vessel Intent

Sekhon, Jasmine; Fleming, Cody (January 2020, Proceedings of the 2nd Conference on Learning for Dynamics and Control)
Full Text Available
Data-driven simulation for energy consumption estimation in a smart home

https://doi.org/10.1007/s10669-019-09727-1

Adams, Stephen; Greenspan, Steven; Velez-Rojas, Maria; Mankovski, Serge; Beling, Peter A. (September 2019, Environment Systems and Decisions)

Full Text Available
4 Dimensional waypoint generation for conflict-free trajectory based operation

https://doi.org/10.1016/j.ast.2019.03.035

Sun, Minghui; Rand, Krista; Fleming, Cody (May 2019, Aerospace Science and Technology)

Full Text Available
Improving Credit Card Fraud Detection by Profiling and Clustering Accounts

https://doi.org/10.1109/SIEDS.2019.8735623

Kasa, Navin; Dahbura, Andrew; Ravoori, Charishma; Adams, Stephen (April 2019, 2019 Systems and Information Engineering Design Symposium (SIEDS))

Full Text Available
Towards Improved Testing For Deep Learning

Sekhon, Jasmine; Fleming, Cody (January 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER))

Full Text Available
