

Title: Agent-based model construction using inverse reinforcement learning
Agent-based modeling (ABM) assumes that the behavioral rules affecting an agent's states and actions are known. However, discovering these rules is often challenging and requires deep insight into an agent's behaviors. Inverse reinforcement learning (IRL) can complement ABM by providing a systematic way to find behavioral rules from data. IRL frames learning behavioral rules as a problem of recovering motivations from observed behavior and generating rules consistent with these motivations. In this paper, we propose a method to construct an agent-based model directly from data using IRL. We explain each step of the proposed method and describe challenges that may occur during implementation. Our experimental results show that the proposed method can extract rules and construct an agent-based model with rich but concise behavioral rules for agents while still maintaining aggregate-level properties.
Award ID(s):
1650512
NSF-PAR ID:
10053915
Journal Name:
2017 Winter Simulation Conference (WSC)
Page Range / eLocation ID:
1264-1275
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
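
As a rough illustration of the workflow the abstract describes (recover a reward function from observed behavior, then derive behavioral rules consistent with it), the following is a minimal, tabular, maximum-entropy-flavored IRL sketch in Python. It is not the authors' implementation: the discrete-MDP encoding, the crude gradient update standing in for the forward soft-value-iteration step, and the greedy rule at the end are all illustrative assumptions.

```python
import numpy as np

def feature_expectations(trajectories, n_states):
    """Average state-visitation counts of the demonstrations.
    Each trajectory is a list of (state, action) pairs."""
    mu = np.zeros(n_states)
    for traj in trajectories:
        for s, _a in traj:
            mu[s] += 1.0
    return mu / max(1, len(trajectories))

def irl_reward(trajectories, n_states, lr=0.1, iters=200):
    """Fit per-state rewards so demonstrated states score highly.
    A real max-ent method would match visitation expectations under
    the policy induced by the reward (via soft value iteration); the
    softmax below is a crude stand-in for that forward step."""
    mu = feature_expectations(trajectories, n_states)
    target = mu / mu.sum()            # demonstration visitation distribution
    w = np.zeros(n_states)            # per-state reward estimate
    for _ in range(iters):
        p = np.exp(w - w.max())
        p /= p.sum()                  # model visitation stand-in
        w += lr * (target - p)        # gradient-style update
    return w

def rule(s, T, w):
    """Derived behavioral rule: in state s, pick the action whose
    successor state has the highest recovered reward.
    T[s] is a dict mapping action -> successor state."""
    return max(T[s], key=lambda a: w[T[s][a]])
```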
More Like this
  1. Agent navigation is a crucial task in today's service and automated factories. Many efforts set specific rules for agents in a given scenario to regulate their behaviors. However, not all situations can be considered in advance, which can lead to poor performance in real-world applications. In this paper, we propose CrowdGAIL, a method that learns an instructing policy from expert behaviors and can train highly 'human-like' agents for navigation problems without manually setting any reward function or prior regulations. First, the proposed model structure is based on generative adversarial imitation learning (GAIL), which imitates, to the greatest extent possible, how humans take actions and move toward the target; by comparison, we demonstrate the advantage of proximal policy optimization (PPO) over trust region policy optimization, and therefore base our model on GAIL-PPO. Second, we design a special Sequential DemoBuffer, compatible with the inner long short-term memory structure, to apply spatiotemporal instruction to the agent's next step. Third, the paper demonstrates the potential of the model with an integrated social manner in a multi-agent scenario by considering human collision avoidance as well as social comfort distance. Finally, experiments on the dataset generated from CrowdNav verify how closely our model acts like a human being at the trajectory level and how it can guide multiple agents while avoiding collisions. Under the same evaluation metrics, CrowdGAIL shows better results than the classic Social-GAN.
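
For readers unfamiliar with the GAIL machinery this abstract builds on, here is a minimal PyTorch sketch of the discriminator and the surrogate reward a PPO learner would maximize. This is a generic GAIL fragment, not CrowdGAIL code; the network sizes and the reward form are common choices assumed here for illustration.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Trained to separate expert (state, action) pairs from the policy's."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def gail_reward(disc, obs, act):
    """Surrogate reward -log(1 - D): high when a transition looks
    expert-like; a PPO learner maximizes this instead of a hand-set
    reward function."""
    with torch.no_grad():
        return -torch.log(torch.sigmoid(-disc(obs, act)) + 1e-8)

def discriminator_loss(disc, exp_obs, exp_act, pol_obs, pol_act):
    """Standard binary cross-entropy: expert pairs labeled 1, policy 0."""
    bce = nn.BCEWithLogitsLoss()
    exp_logits = disc(exp_obs, exp_act)
    pol_logits = disc(pol_obs, pol_act)
    return (bce(exp_logits, torch.ones_like(exp_logits)) +
            bce(pol_logits, torch.zeros_like(pol_logits)))
```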

  2. In large agent-based models, it is difficult to correlate system-level dynamics with individual-level attributes. In this paper, we use inverse reinforcement learning to estimate compact representations of behaviors in large-scale pandemic simulations in the form of reward functions. We illustrate the capacity and performance of these representations in identifying agent-level attributes that correlate with the emerging dynamics of large-scale multi-agent systems. Our experiments use BESSIE, an ABM for COVID-like epidemic processes, where agents make sequential decisions (e.g., use PPE/refrain from activities) based on observations (e.g., the number of mask-wearing people) collected when visiting locations to conduct their activities. The IRL-based reformulations of simulation outputs perform significantly better in classifying agent-level attributes than direct classification of decision trajectories, and are thus more capable of determining agent-level attributes with a definitive role in the collective behavior of the system. We anticipate that this IRL-based approach is broadly applicable to general ABMs.
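
A schematic of the pipeline this abstract describes, with per-agent reward weights recovered by IRL serving as compact classifier features in place of raw decision trajectories, might look as follows. `recover_reward_weights` is a hypothetical placeholder for an actual IRL routine; everything here is illustrative, not the BESSIE code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def recover_reward_weights(trajectory, n_features):
    """Hypothetical placeholder: real code would run IRL on the
    agent's decision trajectory and return its reward weights."""
    rng = np.random.default_rng(abs(hash(str(trajectory))) % 2**32)
    return rng.normal(size=n_features)

def classify_attributes(trajectories, labels, n_features=8):
    """Use recovered reward weights, not raw trajectories, as the
    feature representation for predicting agent-level attributes."""
    X = np.stack([recover_reward_weights(t, n_features) for t in trajectories])
    return LogisticRegression(max_iter=1000).fit(X, labels)
```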
  3.
    The nexus of food, energy, and water systems (FEWS) has become a salient research topic, as well as a pressing societal and policy challenge. Computational modeling is a key tool in addressing these challenges, and FEWS modeling as a subfield is now established. However, social dimensions of FEWS nexus issues, such as individual or social learning, technology adoption decisions, and adaptive behaviors, remain relatively underdeveloped in FEWS modeling and research. Agent-based models (ABMs) have seen increasing use in recent efforts to better represent and integrate human behavior into FEWS research. A systematic review identified 29 articles in which at least two food, energy, or water sectors were explicitly considered with an ABM and/or an ABM-coupled modeling approach. Agent decision-making and behavior ranged from reactive to active, were motivated by objectives ranging from primarily economic to multi-criteria in nature, and were implemented with entities ranging from individuals to highly aggregated groups. However, a significant proportion of models did not contain agent interactions or did not base agent decision-making on existing behavioral theories. Model design choices imposed by data limitations, structural requirements for coupling with other simulation models, or the spatial and/or temporal scales of application resulted in agent representations lacking explicit decision-making processes or social interactions. In contrast, several methodological innovations were also noted, catalyzed by the challenges of developing multi-scale, cross-sector models. Several avenues for future research with ABMs in FEWS research are suggested based on these findings. The reviewed ABM applications represent progress, yet many opportunities for more behaviorally rich agent-based modeling in the FEWS context remain.
  4. Beecham, Roger ; Long, Jed A. ; Smith, Dianna ; Zhao, Qunshan ; Wise, Sarah (Ed.)
    Agent-based models (ABMs) are powerful tools for better understanding, predicting, and responding to diseases. ABMs are well suited to represent human health behaviors, a key driver of disease spread. However, many existing ABMs of infectious respiratory disease spread oversimplify or ignore behavioral aspects due to limited data and the variety of behavioral theories available. Therefore, this study aims to develop and implement a data-driven framework for agent decision-making related to health behaviors in geospatial ABMs of infectious disease spread. The agent decision-making framework uses a logistic regression model expressed in the form of odds ratios to calculate the probability of adopting a behavior. The framework is integrated into a geospatial ABM that simulates the spread of COVID-19 and mask usage among the student population at George Mason University in Fall 2021. The framework leverages odds ratios, which can be derived from surveys or open data, and can be modified to incorporate variables identified by behavioral theories. This advancement will offer the public and decision-makers greater insight into disease transmission, more accurate predictions of disease outcomes, and better preparation for future infectious disease outbreaks.
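
The decision rule the abstract outlines, converting a logistic model expressed as odds ratios into a per-agent probability of adopting a behavior, can be sketched as follows. The baseline odds and odds-ratio values below are illustrative stand-ins, not the study's estimates.

```python
def adoption_probability(baseline_odds, odds_ratios, covariates):
    """Multiply baseline odds by the odds ratio of each covariate
    that applies to this agent, then convert odds to a probability.
    odds_ratios: {name: OR}; covariates: {name: 0/1 indicator}."""
    odds = baseline_odds
    for name, odds_ratio in odds_ratios.items():
        if covariates.get(name, 0):
            odds *= odds_ratio
    return odds / (1.0 + odds)

# Illustrative values only: an agent whose peers wear masks.
p = adoption_probability(
    baseline_odds=0.5,
    odds_ratios={"peer_mask_use": 2.0, "prior_infection": 1.4},
    covariates={"peer_mask_use": 1, "prior_infection": 0})
# In the ABM, the agent would adopt the behavior if a uniform
# random draw falls below p.
```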
  5. Most existing policy learning solutions require the learning agents to receive high-quality supervision signals, e.g., rewards in reinforcement learning (RL) or high-quality expert demonstrations in behavioral cloning (BC). Such high-quality supervision is often infeasible or prohibitively expensive to obtain in practice. We aim for a unified framework that leverages the available cheap weak supervision to perform policy learning efficiently. To handle this problem, we treat the "weak supervision" as imperfect information coming from a peer agent, and evaluate the learning agent's policy based on a "correlated agreement" with the peer agent's policy (instead of simple agreements). Our approach explicitly penalizes a policy for overfitting to the weak supervision. In addition to theoretical guarantees, extensive evaluations on tasks including RL with noisy rewards, BC with weak demonstrations, and standard policy co-training (RL + BC) show that our method leads to substantial performance improvements, especially when the complexity or the noise of the learning environment is high.
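
A minimal sketch of the correlated-agreement idea this abstract describes: score the agent's actions by agreement with the peer (weak) labels, minus agreement with randomly re-paired labels, so that blindly copying a noisy signal gains nothing in expectation. This is a simplified stand-in, not the paper's exact objective.

```python
import numpy as np

def correlated_agreement_score(agent_actions, peer_actions, rng=None):
    """Direct agreement with the peer, minus agreement after the
    pairing is broken by a random shuffle; a policy that merely
    overfits the weak signal scores near zero."""
    if rng is None:
        rng = np.random.default_rng(0)
    agent = np.asarray(agent_actions)
    peer = np.asarray(peer_actions)
    match = (agent == peer).astype(float)        # direct agreement
    shuffled = rng.permutation(peer)             # break the pairing
    chance = (agent == shuffled).astype(float)   # agreement by chance
    return (match - chance).mean()
```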