

Title: A sequential decision making perspective on resilience
We investigate how sequential decision making analysis can be used to model system resilience. In the aftermath of an extreme event, agents involved in emergency management aim for an optimal recovery process, trading off the loss due to lack of system functionality against the investment needed for a fast recovery. This process can be formulated as a sequential decision-making optimization problem, where the overall loss is minimized by adopting an appropriate policy, and dynamic programming applied to Markov Decision Processes (MDPs) provides a rational and computationally feasible framework for quantitative analysis. The paper investigates how trends of post-event loss and recovery can be understood in light of the sequential decision making framework. Specifically, it is well known that a system's functionality is often restored to a level different from that before the event: this can be the result of budget constraints and/or economic opportunity, and the framework has the potential to integrate these considerations. Here, however, we focus on the specific case of an agent learning something new about the process and reacting by updating the target functionality level of the system. We illustrate how this can happen in a simplified setting by using Hidden-Model MDPs (HM-MDPs) to model the management of a set of components under model uncertainty. When an extreme event occurs, the agent updates the hazard model and, consequently, her response and long-term planning.
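The recovery trade-off the abstract describes — loss from lack of functionality versus investment in repair — can be sketched as a small MDP solved by value iteration. Everything below (the four functionality levels, the loss and investment numbers, the transition probabilities) is invented for illustration and is not the paper's model:

```python
import numpy as np

# Toy recovery MDP (all numbers invented for this sketch).
# States: functionality levels 0 (fully damaged) .. 3 (fully functional).
# Actions: 0 = do nothing, 1 = slow repair, 2 = fast repair.
n_states, n_actions, gamma = 4, 3, 0.95

loss = np.array([10.0, 6.0, 2.0, 0.0])   # per-step loss from lost functionality
invest = np.array([0.0, 1.0, 4.0])       # per-step repair investment

# P[a, s, s']: probability of reaching functionality s' from s under action a.
P = np.zeros((n_actions, n_states, n_states))
for s in range(n_states):
    P[0, s, s] = 1.0                     # no repair: functionality stays put
    up = min(s + 1, n_states - 1)
    for a, p in ((1, 0.5), (2, 0.9)):    # repair improves one level w.p. p
        P[a, s, up] += p
        P[a, s, s] += 1.0 - p

V = np.zeros(n_states)
for _ in range(500):                     # value iteration to a fixed point
    Q = loss[None, :] + invest[:, None] + gamma * P @ V   # Q[a, s]
    V = Q.min(axis=0)
policy = Q.argmin(axis=0)                # cost-minimizing repair policy
```

Under these numbers the policy repairs aggressively when functionality (and hence ongoing loss) is worst, and does nothing once the system is fully functional — the trade-off the abstract formalizes.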
Award ID(s):
1638327
PAR ID:
10065508
Author(s) / Creator(s):
Date Published:
Journal Name:
Safety, Reliability, Risk, Resilience and Sustainability of Structures and Infrastructure 12th Int. Conf. on Structural Safety and Reliability, Vienna, Austria, 6–10 August 2017
Volume:
1
Page Range / eLocation ID:
2633-2640
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The functioning of interdependent civil infrastructure systems in the aftermath of a disruptive event is critical to the performance and vitality of any modern urban community. Post-event stressors and chaotic circumstances, time limitations, and complexities in the community recovery process highlight the necessity for a comprehensive decision-making framework at the community level for post-event recovery management. Such a framework must be able to handle large-scale scheduling and decision processes, which involve difficult control problems with large combinatorial decision spaces. This study utilizes approximate dynamic programming algorithms along with heuristics for the identification of optimal community recovery actions following the occurrence of an extreme earthquake event. The proposed approach addresses the curse of dimensionality in its analysis and management of multi-state, large-scale infrastructure systems. Furthermore, the proposed approach can consider the current recovery policies of responsible public and private entities within the community and shows how their performance might be improved. A testbed community coarsely modeled after Gilroy, California, is utilized as an illustrative example. While the illustration provides optimal policies for the Electrical Power Network serving Gilroy following a severe earthquake, preliminary work shows that the methodology is computationally well suited to other infrastructure systems and hazards.
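One way approximate dynamic programming sidesteps the combinatorial decision space mentioned above is the rollout idea: score each candidate first action by simulating a cheap base heuristic from it, then take the best-scoring candidate. The sketch below illustrates only that flavor — the component names, loss rates, and "repair in index order" base heuristic are invented, not the study's actual algorithm:

```python
# Rollout sketch for choosing which damaged component to repair next.
# Loss for each component accrues until its repair completes; a single crew
# repairs components one at a time.

def simulate(damaged, first, loss_rate, repair_time, base_order):
    """Total loss if `first` is repaired now and the rest follow base_order."""
    t, total = 0.0, 0.0
    pending = [first] + [c for c in base_order if c in damaged and c != first]
    for c in pending:
        t += repair_time[c]
        total += loss_rate[c] * t    # c bleeds loss until its repair finishes
    return total

def rollout_choice(damaged, loss_rate, repair_time):
    base_order = sorted(damaged)     # base heuristic: fixed index order
    return min(damaged,
               key=lambda c: simulate(damaged, c, loss_rate, repair_time,
                                      base_order))

loss_rate = {0: 5.0, 1: 1.0, 2: 3.0}    # loss per unit time while down
repair_time = {0: 2.0, 1: 1.0, 2: 4.0}  # crew-time to repair each component
print(rollout_choice({0, 1, 2}, loss_rate, repair_time))
```

Only one simulation per candidate action is needed per step, so the cost grows with the number of candidates rather than with the full decision space — the essence of how rollout-style approximate DP tames the curse of dimensionality.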
  3. vanBerkel, Kees; Ciabattoni, Agata; Horty, John (Ed.)
    Markov Decision Processes (MDPs) are the most common model for decision making under uncertainty in the Machine Learning community. An MDP captures nondeterminism, probabilistic uncertainty, and an explicit model of action. A Reinforcement Learning (RL) agent learns to act in an MDP by maximizing a utility function. This paper considers the problem of learning a decision policy that maximizes utility subject to satisfying a constraint expressed in deontic logic. In this setup, the utility captures the agent's mission, such as going quickly from A to B. The deontic formula represents (ethical, social, situational) constraints on how the agent might achieve its mission by prohibiting classes of behaviors. We use the logic of Expected Act Utilitarianism, a probabilistic stit logic that can be interpreted over controlled MDPs. We develop a variation on policy improvement, and show that it reaches a constrained local maximum of the mission utility. Given that in stit logic, an agent's duty is derived from value maximization, this can be seen as a way of acting to simultaneously maximize two value functions, one of which is implicit, in a bi-level structure. We illustrate these results with experiments on sample MDPs.
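The core mechanism — policy improvement restricted to permitted actions — can be shown with a toy mask over state-action pairs. This is only a stand-in for the idea (the three-state MDP, rewards, and the forbidden pair are invented, and none of the stit-logic machinery is modeled):

```python
import numpy as np

# Policy iteration where improvement is restricted to actions a constraint
# permits, yielding a constrained local maximum of utility.
n_s, n_a, gamma = 3, 2, 0.9
R = np.array([[0.0, 5.0], [1.0, 2.0], [0.0, 0.0]])  # R[s, a], invented
P = np.zeros((n_a, n_s, n_s))
P[0] = np.eye(n_s)                                   # action 0: stay in place
P[1] = np.roll(np.eye(n_s), 1, axis=1)               # action 1: cycle s -> s+1

allowed = np.ones((n_s, n_a), dtype=bool)
allowed[0, 1] = False        # "deontic" constraint: action 1 forbidden in s0

policy = np.zeros(n_s, dtype=int)                    # start with a permitted policy
for _ in range(50):
    # policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly
    P_pi = P[policy, np.arange(n_s)]
    R_pi = R[np.arange(n_s), policy]
    V = np.linalg.solve(np.eye(n_s) - gamma * P_pi, R_pi)
    # constrained improvement: argmax over permitted actions only
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    Q[~allowed] = -np.inf
    policy = Q.argmax(axis=1)
```

Masking forbidden pairs to -inf before the argmax guarantees every iterate stays within the permitted behavior class, even when (as here, where taking action 1 in s0 pays 5) the unconstrained optimum would violate the constraint.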
  4. Abstract To be responsive to dynamically changing real-world environments, an intelligent agent needs to perform complex sequential decision-making tasks that are often guided by commonsense knowledge. The previous work on this line of research led to the framework called interleaved commonsense reasoning and probabilistic planning (iCORPP), which used P-log for representing commonsense knowledge and Markov Decision Processes (MDPs) or Partially Observable MDPs (POMDPs) for planning under uncertainty. A main limitation of iCORPP is that its implementation requires non-trivial engineering efforts to bridge the commonsense reasoning and probabilistic planning formalisms. In this paper, we present a unified framework to integrate iCORPP's reasoning and planning components. In particular, we extend the probabilistic action language pBC+ to express utility, belief states, and observations as in POMDP models. Inheriting the advantages of action languages, the new action language provides an elaboration tolerant representation of POMDPs that reflects commonsense knowledge. The idea led to the design of the system pbcplus2pomdp, which compiles a pBC+ action description into a POMDP model that can be directly processed by off-the-shelf POMDP solvers to compute an optimal policy of the pBC+ action description. Our experiments show that it retains the advantages of iCORPP while avoiding the manual efforts in bridging the commonsense reasoner and the probabilistic planner.
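The belief states and observations this abstract adds to the action language are the standard POMDP ingredients; the generic textbook belief-update step (not anything specific to pbcplus2pomdp, and with an invented two-state model) looks like this:

```python
import numpy as np

# Standard POMDP belief update: b'(s') proportional to
# O(o | s') * sum_s T(s' | s, a) * b(s), then renormalize.
def belief_update(b, T_a, O_o):
    """b: belief over states; T_a[s, s']: transition under the chosen action;
    O_o[s']: likelihood of the received observation in each next state."""
    b_new = O_o * (b @ T_a)
    return b_new / b_new.sum()

T_a = np.array([[0.7, 0.3],      # invented two-state transition model
                [0.2, 0.8]])
O_o = np.array([0.9, 0.1])       # the observation strongly favors state 0
print(belief_update(np.array([0.5, 0.5]), T_a, O_o))
```

An off-the-shelf POMDP solver plans over exactly such beliefs, which is why compiling an action description into this form makes it directly solvable.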
  5. Many resource management problems require sequential decision-making under uncertainty, where the only uncertainty affecting the decision outcomes comes from exogenous variables outside the control of the decision-maker. We model these problems as Exo-MDPs (Markov Decision Processes with Exogenous Inputs) and design a class of data-efficient algorithms for them termed Hindsight Learning (HL). Our HL algorithms achieve data efficiency by leveraging a key insight: having samples of the exogenous variables, past decisions can be revisited in hindsight to infer counterfactual consequences that can accelerate policy improvements. We compare HL against classic baselines in the multi-secretary and airline revenue management problems. We also scale our algorithms to a business-critical cloud resource management problem: allocating Virtual Machines (VMs) to physical machines, and simulate their performance with real datasets from a large public cloud provider. We find that HL algorithms outperform domain-specific heuristics, as well as state-of-the-art reinforcement learning methods.
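The key insight — exogenous variables can be recorded and replayed, so alternative decisions can be scored against the same randomness — can be illustrated with a toy admission problem. The trace, policies, and reward below are invented for the sketch and are not the HL algorithms themselves:

```python
# Hindsight replay sketch: because the demand trace is exogenous (unaffected
# by the agent's actions), any alternative policy can be re-run against the
# *same* recorded trace to evaluate counterfactual decisions.

def replay(policy, demand_trace, capacity):
    """Replay a toy VM-admission policy against a recorded demand trace."""
    used, reward = 0, 0
    for demand in demand_trace:          # exogenous: independent of actions
        if policy(used, demand, capacity) and used + demand <= capacity:
            used += demand
            reward += demand             # reward = resources usefully packed
    return reward

trace = [3, 5, 2, 4]                     # recorded exogenous demands
greedy = lambda used, d, cap: True       # admit whenever the request fits
picky = lambda used, d, cap: d >= 4      # only admit large requests
print(replay(greedy, trace, capacity=8), replay(picky, trace, capacity=8))
```

In a general MDP this replay would be invalid, because the agent's actions would change the future inputs; the Exo-MDP structure is exactly what licenses the counterfactual comparison.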