

Search for: All records

Creators/Authors contains: "Beling, Peter"


  1. In large agent-based models, it is difficult to correlate system-level dynamics with individual-level attributes. In this paper, we use inverse reinforcement learning (IRL) to estimate compact representations of behaviors in large-scale pandemic simulations in the form of reward functions. We illustrate the capacity and performance of these representations in identifying agent-level attributes that correlate with the emerging dynamics of large-scale multi-agent systems. Our experiments use BESSIE, an ABM for COVID-like epidemic processes, where agents make sequential decisions (e.g., use PPE or refrain from activities) based on observations (e.g., the number of mask-wearing people) collected when visiting locations to conduct their activities. The IRL-based reformulations of simulation outputs perform significantly better in classification of agent-level attributes than direct classification of decision trajectories, and are thus more capable of determining which agent-level attributes play a definitive role in the collective behavior of the system. We anticipate that this IRL-based approach is broadly applicable to general ABMs.
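BESSIE's agents and observation model are not reproduced here, and the abstract does not name the specific IRL formulation used. As a generic sketch of the core idea — recovering a reward function whose induced behavior matches observed decision trajectories — here is a minimal maximum-entropy IRL loop on a toy chain MDP; the MDP, features, and learning rate are all illustrative assumptions, not the paper's setup:

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.9, iters=100):
    """Soft (max-entropy) value iteration; returns a stochastic policy pi[a, s]."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        Q = r[None, :] + gamma * np.einsum('ast,t->as', P, V)   # Q[a, s]
        m = Q.max(axis=0)
        V = m + np.log(np.exp(Q - m).sum(axis=0))               # log-sum-exp over actions
    return np.exp(Q - V[None, :])

def expected_features(P, pi, p0, features, T):
    """Expected feature counts of the policy over a T-step rollout from p0."""
    d, total = p0.copy(), np.zeros(features.shape[1])
    for _ in range(T):
        total += features.T @ d
        d = np.einsum('as,ast->t', pi * d[None, :], P)          # next state distribution
    return total

def maxent_irl(P, features, expert_trajs, p0, lr=0.05, epochs=150):
    """Fit reward weights w so the soft-optimal policy's feature counts
    match the expert trajectories' empirical feature counts."""
    w = np.zeros(features.shape[1])
    T = max(len(t) for t in expert_trajs)
    f_expert = np.mean([features[list(t)].sum(axis=0) for t in expert_trajs], axis=0)
    for _ in range(epochs):
        pi = soft_value_iteration(P, features @ w)
        w += lr * (f_expert - expected_features(P, pi, p0, features, T))
    return w   # compact behavioral representation for one agent
```

In the paper's setting, a vector like `w` (one per agent) is the compact representation that a downstream classifier of agent-level attributes would consume, in place of the raw decision trajectory.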
  2.
    Modern engineered systems, and learning-based systems in particular, exhibit unprecedented complexity that requires advances in our methods for achieving confidence in mission success through test and evaluation (T&E). We define learning-based systems as engineered systems that incorporate a learning-algorithm (artificial intelligence) component within the overall system. Part of this unparalleled complexity is the rate at which learning-based systems change relative to traditional engineered systems. Where traditional systems are expected to steadily decline (change) in performance over time (aging), learning-based systems undergo constant change, which must be better understood to achieve high confidence in mission success. To this end, we propose pairing Bayesian methods with systems theory to quantify changes in operational conditions, changes in adversarial actions, the resultant changes in the learning-based system's structure, and the resultant confidence measures in mission success. In this article, we provide insights into our overall goal and our progress toward developing a framework for evaluation through an understanding of equivalence of testing.

     
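The abstract describes the framework only at the level of goals. As one concrete and standard way to quantify confidence in mission success from T&E outcomes, here is a Beta-Binomial conjugate update — a sketch only, since the article's actual pairing of Bayesian methods with systems theory is not specified here:

```python
import math

def update_beta(a, b, successes, failures):
    """Conjugate update: a Beta(a, b) prior over the mission-success
    probability plus Binomial test outcomes yields a
    Beta(a + successes, b + failures) posterior."""
    return a + successes, b + failures

def beta_pdf(p, a, b):
    """Beta density, computed in log space for numerical stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def prob_success_above(threshold, a, b, n=20000):
    """P(success probability > threshold | data), by midpoint-rule integration."""
    width = (1.0 - threshold) / n
    return sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(n)) * width
```

Starting from a flat Beta(1, 1) prior, 9 successful and 1 failed test give a Beta(10, 2) posterior with mean 10/12 ≈ 0.83. When operational conditions shift — the article's central concern — old evidence can be down-weighted in the prior rather than carried over wholesale.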
  3. Real-time control (RTC) of stormwater systems can reduce flooding and improve water quality. Current industry real-time control strategies use simple rules based on water-quantity parameters at a local scale. However, system-level control methods that also incorporate observations of water quality could provide improved control and performance. Therefore, the objective of this research is to evaluate the impact of local and system-level control approaches on flooding and sediment-related water quality in a stormwater system within the flood-prone coastal city of Norfolk, Virginia, USA. Deep reinforcement learning (RL), an emerging machine learning technique, is used to learn system-level control policies that attempt to balance flood mitigation and treatment of sediment. RL is compared to the conventional passive stormwater system and two methods of local-scale rule-based control: (i) industry-standard predictive rule-based control with a fixed detention time and (ii) rules based on water quality observations. For the studied system, both methods of rule-based control improved water quality compared to the passive system, but increased total system flooding due to uncoordinated releases of stormwater. An RL agent learned controls that maintained target pond levels while reducing total system flooding by 4% compared to the passive system. When pre-trained from the RL agent that learned to reduce flooding, another RL agent was able to learn to decrease TSS (total suspended solids) export by an average of 52% compared to the passive system, with an average of 5% less flooding than the rule-based control methods. As the complexity of stormwater RTC implementations grows and climate change continues, system-level control approaches such as the RL used here will be needed to help mitigate flooding and protect water quality.
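The two rule-based baselines are described only qualitatively in the abstract. A minimal sketch of what a local, quantity-plus-detention rule can look like (all thresholds hypothetical, not the study's values) makes the contrast with a learned, system-level policy concrete:

```python
def local_valve_rule(depth_m, hours_since_storm, max_depth_m=2.0,
                     detention_hours=24.0):
    """Local rule-based control for one detention pond: hold water for a
    fixed detention time so sediment can settle, but open fully if the pond
    nears its flooding depth. Returns a valve opening in [0, 1]."""
    if depth_m >= 0.9 * max_depth_m:         # flood-risk override
        return 1.0
    if hours_since_storm < detention_hours:  # still in the treatment window
        return 0.0
    return 0.5                               # slow post-detention drawdown
```

Because each pond applies such a rule independently, simultaneous flood-risk overrides can release water in an uncoordinated way — the system-level flooding increase the study observed, and the motivation for a coordinated RL policy.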
  4. Compared with capital improvement projects, real-time control of stormwater systems may be a more effective and efficient approach to address the increasing risk of flooding in urban areas. One way to automate the design process of control policies is through reinforcement learning (RL). Recently, RL methods have been applied to small stormwater systems and have demonstrated better performance over passive systems and simple rule-based strategies. However, it remains unclear how effective RL methods are for larger and more complex systems. Current RL-based control policies also suffer from poor convergence and stability, which may be due to large updates made by the underlying RL algorithm. In this study, we use the Proximal Policy Optimization (PPO) algorithm and develop control policies for a medium-sized stormwater system that can significantly mitigate flooding during large storm events. Our approach demonstrates good convergence behavior and stability, and achieves robust out-of-sample performance. 
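PPO's stabilizing ingredient — the reason the study credits it with better convergence and stability than control policies trained with large-update RL algorithms — is the clipped surrogate objective, which caps how far a single gradient step can move the policy. A minimal NumPy version (ε = 0.2 is the common default, not necessarily the study's setting):

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO's clipped surrogate loss: the probability ratio is clipped to
    [1 - eps, 1 + eps], so an update cannot profit from moving the policy
    far outside the trust region. Minimized by gradient descent."""
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

With identical old and new policies the ratio is 1 and nothing is clipped; once the ratio drifts past 1 ± ε, the objective stops rewarding further movement, which is what tames the large updates the abstract identifies as a convergence problem.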
  5. The exploitation of extra state information has been an active research area in multi-agent reinforcement learning (MARL). QMIX represents the joint action-value using a non-negative function approximator and achieves the best performance on the StarCraft II micromanagement testbed, a common MARL benchmark. However, our experiments demonstrate that, in some cases, QMIX performs sub-optimally with the A2C framework, a training paradigm that promotes algorithm training efficiency. To obtain a reasonable trade-off between training efficiency and algorithm performance, we extend value-decomposition to actor-critic methods that are compatible with A2C and propose a novel actor-critic framework, value-decomposition actor-critic (VDAC). We evaluate VDAC on the StarCraft II micromanagement task and demonstrate that the proposed framework improves median performance over other actor-critic methods. Furthermore, we use a set of ablation experiments to identify the key factors that contribute to the performance of VDAC. 
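The "non-negative function approximator" mentioned above refers to a mixing network whose weights are constrained non-negative, so the joint value is monotone in each agent's value; VDAC carries a similar decomposition into the actor-critic setting. A minimal sketch of the monotonicity constraint itself (fixed random weight matrices stand in for the hypernetwork-generated weights QMIX actually uses):

```python
import numpy as np

def monotonic_mix(agent_values, w1, b1, w2, b2):
    """Joint value V_tot from per-agent values: taking |w| makes every weight
    non-negative, so dV_tot/dV_i >= 0 for each agent i -- the monotonicity
    that lets per-agent greedy choices agree with the joint greedy choice."""
    hidden = np.maximum(0.0, agent_values @ np.abs(w1) + b1)  # ReLU layer
    return float(hidden @ np.abs(w2) + b2)
```

Because both the absolute-valued linear maps and ReLU are non-decreasing, raising any single agent's value can never lower the mixed joint value, whatever the (random) weights are.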
  8. The ability to model and predict the traffic surrounding an ego-vehicle is crucial for autonomous pilots and intelligent driver-assistance systems, and acceleration prediction is one of the major components of traffic prediction. This paper proposes novel approaches to the acceleration prediction problem. By representing spatial relationships between vehicles with a graph model, we build a generalized acceleration prediction framework. The paper studies the effectiveness of the proposed Graph Convolution Networks, which operate on these graphs to predict the acceleration distribution for vehicles driving on highways. We further investigate prediction improvements from integrating Recurrent Neural Networks to disentangle the temporal complexity inherent in the traffic data. Simulation results across comprehensive performance metrics show that the proposed networks outperform state-of-the-art methods in generating realistic trajectories over a prediction horizon.
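The paper's network architecture is not detailed in the abstract; the propagation rule below is the standard Kipf-Welling graph convolution, shown as a plausible building block for aggregating each vehicle's features with those of its spatial neighbours (the adjacency, features, and weights are all illustrative):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: a renormalized adjacency (self-loops
    added, symmetric degree normalization) aggregates neighbour features,
    followed by a shared linear map and ReLU."""
    A_hat = A + np.eye(A.shape[0])                 # each vehicle attends to itself
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_norm @ H @ W)
```

Stacking a few such layers lets information propagate over multi-hop vehicle neighbourhoods; in the paper's framework, a recurrent network would then consume the per-timestep node embeddings to capture temporal structure.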
  9.
    As the number of Internet of Things (IoT) devices continues to increase, energy-harvesting (EH) devices eliminate the need to replace batteries or find outlets for sensors in indoor environments. This comes at a cost, however, as energy-harvesting devices introduce new failure modes not present in traditional IoT devices: extended periods without harvestable energy cause them to go dormant, their often simple wireless protocols are unreliable, and their limited energy reserves prohibit many diagnostic features. While energy-harvesting sensors promise easy-to-set-up and maintenance-free deployments, these limitations hinder robust, long-term data collection. To continuously monitor and maintain a network of energy-harvesting devices in buildings, we propose EH-HouseKeeper, a data-driven system that monitors EH device compliance and predicts healthy signal zones in a building, based on the existing gateway location(s) and building profile, for easier device maintenance. EH-HouseKeeper does this by first filtering excess event-triggered data points and then applying representation learning to building features that describe the path between the gateways and each device. We assessed EH-HouseKeeper by deploying 125 energy-harvesting sensors of varying types in a 17,000-square-foot research infrastructure, randomly masking a quarter of the sensors as the test set for validation. The results of our 6-month data-collection period demonstrate an average accuracy greater than 80% in predicting the health status of the held-out subset. Our results validate techniques for assessing sensor health status across device types and for inferring gateway status, as well as approaches for distinguishing between gateway, transmission, and sensor faults.
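The held-out evaluation described above — randomly masking a quarter of the deployed sensors so the model never sees their health labels during training — can be sketched as follows (the function and seed are illustrative, not the authors' code):

```python
import random

def mask_test_sensors(sensor_ids, test_fraction=0.25, seed=42):
    """Randomly hold out a fraction of sensors as an unseen test set;
    returns (train_ids, test_ids) with no overlap."""
    rng = random.Random(seed)
    ids = sorted(sensor_ids)   # deterministic base order before shuffling
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]
```

Splitting by sensor rather than by time step is what makes the reported >80% accuracy a claim about generalizing to devices the model has never observed.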
  10.
    Flooding in many areas is becoming more prevalent due to factors such as urbanization and climate change, requiring modernization of stormwater infrastructure. Retrofitting standard passive systems with controllable valves/pumps is promising, but requires real-time control (RTC). One method of automating RTC is reinforcement learning (RL), a general technique for sequential optimization and control in uncertain environments. The notion is that an RL algorithm can use inputs of real-time flood data and rainfall forecasts to learn a policy for controlling the stormwater infrastructure so as to minimize measures of flooding. In real-world conditions, rainfall forecasts and other state information are subject to noise and uncertainty. To account for these characteristics of the problem data, we implemented Deep Deterministic Policy Gradient (DDPG), an RL algorithm that is distinguished by its capability to handle noise in the input data. DDPG implementations were trained and tested against a passive flood control policy. Three primary cases were studied: (i) perfect data, (ii) imperfect rainfall forecasts, and (iii) imperfect water level and forecast data. One hundred rainfall episodes that caused flooding in the passive system were selected from 10 years of observations in Norfolk, Virginia, USA; 85 randomly selected episodes were used for training and the remaining 15 unseen episodes served as test cases. Compared to the passive system, all RL implementations reduced flooding volume by 70.5% on average, with results across the three cases falling within a 5% range of one another. This suggests that DDPG is robust to noisy input data, which is essential knowledge for advancing the real-world applicability of RL for stormwater RTC.
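Two of DDPG's standard ingredients are relevant to the noise-robustness result above: slowly updated target networks (Polyak averaging) and training against perturbed inputs. A minimal sketch of both (τ and the noise level are conventional illustrative values, not the study's settings):

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target-network parameters trail the online networks,
    keeping DDPG's bootstrapped critic targets slowly moving and stable."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

def corrupt_forecast(forecast, rel_err, rng):
    """Multiplicative Gaussian noise on a rainfall forecast, mimicking the
    imperfect-forecast training cases (ii) and (iii) described above."""
    return forecast * (1.0 + rel_err * rng.standard_normal(forecast.shape))
```

Training the agent on forecasts corrupted this way, then evaluating on unseen storm episodes, is what supports the conclusion that the learned policy tolerates the noise present in operational data.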