Search for: All records

Creators/Authors contains: "Su, Jianyu"

  1. The exploitation of extra state information has been an active research area in multi-agent reinforcement learning (MARL). QMIX represents the joint action-value using a non-negative function approximator and achieves the best performance on the StarCraft II micromanagement testbed, a common MARL benchmark. However, our experiments demonstrate that, in some cases, QMIX performs sub-optimally with the A2C framework, a training paradigm that promotes algorithm training efficiency. To obtain a reasonable trade-off between training efficiency and algorithm performance, we extend value-decomposition to actor-critic methods that are compatible with A2C and propose a novel actor-critic framework, value-decomposition actor-critic (VDAC). We evaluate VDAC on the StarCraft II micromanagement task and demonstrate that the proposed framework improves median performance over other actor-critic methods. Furthermore, we use a set of ablation experiments to identify the key factors that contribute to the performance of VDAC. 
  2. The ability to model and predict the traffic surrounding the ego vehicle is crucial for autonomous pilots and intelligent driver-assistance systems. Acceleration prediction is one of the major components of traffic prediction. This paper proposes novel approaches to the acceleration prediction problem. By representing spatial relationships between vehicles with a graph model, we build a generalized acceleration prediction framework. We study the effectiveness of the proposed Graph Convolution Networks, which operate on graphs to predict the acceleration distribution for vehicles driving on highways. We further investigate the prediction improvement gained by integrating Recurrent Neural Networks to disentangle the temporal complexity inherent in the traffic data. Simulation results across comprehensive performance metrics indicate that the proposed networks outperform state-of-the-art methods in generating realistic trajectories over a prediction horizon.
  3. As the number of Internet of Things (IoT) devices continues to increase, energy-harvesting (EH) devices eliminate the need to replace batteries or find outlets for sensors in indoor environments. This comes at a cost, however, as these energy-harvesting devices introduce new failure modes not present in traditional IoT devices: extended periods of no harvestable energy cause them to go dormant, their often simple wireless protocols are unreliable, and their limited energy reserves prohibit many diagnostic features. While energy-harvesting sensors promise easy-to-set-up and maintenance-free deployments, their limitations hinder robust, long-term data collection. To continuously monitor and maintain a network of energy-harvesting devices in buildings, we propose EH-HouseKeeper. EH-HouseKeeper is a data-driven system that monitors EH device compliance and, to ease device maintenance, predicts healthy signal zones in a building based on the existing gateway location(s) and building profile. It does this by first filtering excess event-triggered data points and then applying representation learning to building features that describe the path between the gateways and the device. We assessed EH-HouseKeeper by deploying 125 energy-harvesting sensors of varying types in a 17,000-square-foot research infrastructure, randomly masking a quarter of the sensors as the test set for validation. The results of our 6-month data-collection period demonstrate an average accuracy greater than 80% in predicting the health status of the held-out subset. Our results validate techniques for assessing sensor health status across device types and for inferring gateway status, as well as approaches that help distinguish between gateway, transmission, and sensor faults.
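
The evaluation protocol in the EH-HouseKeeper abstract (randomly masking a quarter of 125 sensors as a test set) can be illustrated with a small sketch. The function name `split_sensors`, the sensor ID format, the fixed seed, and the rounding rule are all assumptions made for illustration; the abstract does not say how the authors drew their split.

```python
import random


def split_sensors(sensor_ids, test_fraction=0.25, seed=0):
    """Randomly hold out a fraction of sensors as a test set.

    Hypothetical helper: the name, seed, and rounding are illustrative only.
    """
    rng = random.Random(seed)      # local generator, so the split is repeatable
    shuffled = list(sensor_ids)    # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    test_ids = sorted(shuffled[:n_test])
    train_ids = sorted(shuffled[n_test:])
    return train_ids, test_ids


# 125 placeholder sensor IDs, matching the deployment size in the abstract.
sensors = [f"sensor-{i:03d}" for i in range(125)]
train_ids, test_ids = split_sensors(sensors)
print(len(train_ids), len(test_ids))  # prints "94 31"
```

Because 125 is not divisible by four, any such split has to round; this sketch rounds 31.25 to 31 held-out sensors, which may differ from the count the authors used.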
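
The first record mentions that QMIX represents the joint action-value with a non-negative function approximator. The property this buys is monotonicity: if the mixing weights are non-negative, raising any single agent's value can never lower the joint value. The sketch below shows that idea with one linear mixing layer in plain Python. It is a deliberately simplified assumption, not the QMIX or VDAC architecture, and the name `mix_values` and the example numbers are hypothetical.

```python
def mix_values(agent_values, raw_weights, bias=0.0):
    """Combine per-agent values into one joint value with non-negative weights.

    Taking the absolute value of each raw weight is one simple way to enforce
    non-negativity (an assumption here, chosen for illustration).
    """
    weights = [abs(w) for w in raw_weights]
    return sum(w * v for w, v in zip(weights, agent_values)) + bias


agent_values = [1.0, 2.0, 3.0]   # illustrative per-agent values
raw_weights = [-0.5, 0.2, 1.0]   # unconstrained parameters; sign is discarded

joint = mix_values(agent_values, raw_weights, bias=0.1)

# Raising one agent's value while holding the others fixed never lowers the joint value.
bumped = mix_values([2.0, 2.0, 3.0], raw_weights, bias=0.1)
print(bumped >= joint)  # prints "True"
```

With a simple sum (all weights equal to one) this reduces to an additive decomposition of the joint value; richer mixers keep the same non-negativity constraint while letting the weights depend on extra state information.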