With advancements in sensor technology, real-time monitoring of machine health conditions allows us to perform condition-based maintenance (CBM) for multi-unit systems. In a multi-unit system, the maintenance decision for one unit usually depends on the other units, inducing an exponentially large state space and making CBM of large multi-unit systems a very challenging engineering problem. In this work, we first propose two heuristic decision policies for multi-unit systems, namely the binary action policy and the -policy. We then propose a multi-step lookahead rollout approach that uses the two heuristic policies to solve this challenging CBM problem: the binary action policy effectively reduces the action space, and thus the computational load of the rollout, while the -policy serves as a strong base policy for the rollout to improve upon. The theoretical gap between the proposed rollout approach and the optimal policy is also derived. The study further presents extensive experiments demonstrating the effectiveness of the proposed lookahead rollout approach on small (3 and 5 units), medium (10 and 15 units), and large (20, 30, 40, and 50 units) scale systems.
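To make the rollout idea concrete, here is a minimal sketch of a rollout step (the paper uses a multi-step lookahead and derives optimality-gap bounds, neither of which is reproduced here). The model hooks `base_policy`, `step_cost`, and `sample_next`, as well as the truncated simulation horizon, are illustrative assumptions, not the paper's API:

```python
def rollout_action(state, actions, base_policy, step_cost, sample_next,
                   horizon=3, n_sims=20):
    """Pick the action whose simulated cost-to-go under the base heuristic
    policy is lowest. All callables are illustrative placeholders."""
    best_action, best_cost = None, float("inf")
    for a in actions:
        total = 0.0
        for _ in range(n_sims):            # Monte Carlo estimate of cost-to-go
            cost = step_cost(state, a)
            s = sample_next(state, a)
            for _ in range(horizon):       # then follow the base policy
                a_base = base_policy(s)
                cost += step_cost(s, a_base)
                s = sample_next(s, a_base)
            total += cost
        if total / n_sims < best_cost:
            best_action, best_cost = a, total / n_sims
    return best_action
```

In this sketch the binary action policy would enter by shrinking the `actions` list passed in, and the heuristic base policy plays the role of `base_policy`.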
Component-wise Markov decision process for solving condition-based maintenance of large multi-component systems with economic dependence
Condition-based maintenance (CBM) of multi-component systems is a prevalent engineering problem due to its effectiveness in reducing the operational and maintenance costs of the system. However, computing exact optimal maintenance decisions for large multi-component systems is computationally challenging, and often infeasible, because the system state and action spaces grow exponentially with the number of components. To address this scalability issue, we propose a Component-Wise Markov Decision Process (CW-MDP) and an Adjusted Component-Wise Markov Decision Process (ACW-MDP) to approximate the optimal system-level CBM decision policy for large systems with heterogeneous components. We propose using an extended single-component action space to model the impact of the system-level setup cost on a component-level solution. The theoretical gap between the proposed approach and the system-level optimum is also derived, along with theoretical convergence results and the relationship between ACW-MDP and CW-MDP. The study further presents extensive numerical studies demonstrating the effectiveness of component-wise solutions for solving large multi-component systems.
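The computational kernel behind any component-wise decomposition is solving a small per-component MDP. Below is a minimal value-iteration sketch; note that it folds the system-level setup cost into an apportioned per-component cost, a simplifying assumption standing in for the paper's extended-action-space construction:

```python
import numpy as np

def solve_component_mdp(P, cost, gamma=0.95, tol=1e-8):
    """Value iteration for one component's MDP.
    P[a]    : |S| x |S| transition matrix under action a.
    cost[a] : |S| immediate-cost vector under action a (maintenance cost
              plus an apportioned share of the system-level setup cost;
              this apportionment is an illustrative assumption)."""
    V = np.zeros(P[0].shape[0])
    while True:
        Q = np.array([cost[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=0)   # value function, greedy policy
        V = V_new
```

For heterogeneous components, this solve would simply be repeated once per component with that component's own transition and cost model.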
- Award ID(s): 2323082
- PAR ID: 10553801
- Publisher / Repository: Taylor & Francis
- Date Published:
- Journal Name: IISE Transactions
- ISSN: 2472-5854
- Page Range / eLocation ID: 1 to 14
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Often in manufacturing systems, scenarios arise where the demand for maintenance exceeds the capacity of maintenance resources, creating the problem of allocating limited resources among the machines competing for them. This maintenance scheduling problem can be formulated as a Markov decision process (MDP) with the goal of finding the optimal dynamic maintenance action given the current system state. However, as the system becomes more complex, solving the MDP suffers from the curse of dimensionality. To overcome this issue, we propose a two-stage approach that first optimizes a static condition-based maintenance (CBM) policy using a genetic algorithm (GA) and then improves the policy online via Monte Carlo tree search (MCTS). The static policy significantly reduces the state space of the online problem by allowing us to ignore machines that are not sufficiently degraded. Furthermore, we formulate the MCTS to seek a maintenance schedule that maximizes the long-term production volume of the system, reconciling the conflict between maintenance and production objectives. We demonstrate that the resulting online policy improves on the static CBM policy found by the GA.
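A rough sketch of how the two stages could interact; the threshold form of the static policy and all names are illustrative assumptions, not the paper's exact formulation:

```python
def online_candidates(degradation, thresholds):
    """Stage 1 (offline): a GA tunes one degradation threshold per machine,
    defining a static CBM policy. Stage 2 (online): only machines whose
    degradation meets its threshold enter the MCTS search state, which is
    how the static policy shrinks the online problem."""
    return [m for m, (d, t) in enumerate(zip(degradation, thresholds))
            if d >= t]

# e.g. online_candidates([0.2, 0.9, 0.7], [0.8, 0.8, 0.6]) -> [1, 2]:
# only machines 1 and 2 are degraded enough to be considered by MCTS.
```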
When the operation and maintenance (O&M) of infrastructure components is modeled as a Markov Decision Process (MDP), the stochastic evolution under the optimal policy is completely described by a Markov transition matrix. This paper illustrates how to predict relevant features of the time evolution of these controlled components. We are interested in assessing whether a critical state is reachable, the probability of reaching that state within a given time period, the probability of visiting that state before another, and the probability of returning to that state. We present analytical methods to address these questions and discuss their computational complexity. The outcomes of these analyses can give decision makers a deeper understanding of component evolution and suggest revisions to the control policy. We formulate the framework for MDPs and extend it to Partially Observable Markov Decision Processes (POMDPs).
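For the within-a-time-period question, one standard construction (consistent in spirit with, though not necessarily identical to, the paper's analytical methods) is to make the critical state absorbing and take matrix powers:

```python
import numpy as np

def prob_reach_within(P, start, critical, T):
    """Probability that a Markov chain with transition matrix P, started in
    state `start`, visits state `critical` within T steps. Making the
    critical state absorbing means the probability mass sitting there at
    time T equals the probability of having reached it by time T."""
    Q = P.copy()
    Q[critical, :] = 0.0
    Q[critical, critical] = 1.0          # absorb at the critical state
    return np.linalg.matrix_power(Q, T)[start, critical]
```

Reachability itself then corresponds to this probability being positive for some T, and the visit-before-another question can be handled the same way by absorbing at both states.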
We develop a general reinforcement learning framework for mean field control (MFC) problems. Such problems arise, for instance, as the limit of collaborative multi-agent control problems when the number of agents is very large. The asymptotic problem can be phrased as the optimal control of a non-linear dynamics. It can also be viewed as a Markov decision process (MDP), but the key difference from the usual RL setup is that the dynamics and the reward now depend on the state's probability distribution itself. Alternatively, it can be recast as an MDP on the Wasserstein space of measures. In this work, we introduce generic model-free algorithms based on the state-action value function at the mean field level and prove convergence for a prototypical Q-learning method. We then implement an actor-critic method and report numerical results on two archetypal problems: a finite-space model motivated by a cyber security application and a continuous-space model motivated by an application to swarm motion.
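A minimal sketch of the tabular update such a Q-learning method builds on; the mean-field lifting itself (discretizing the space of distributions so it can serve as the state) is not shown, and all names are illustrative:

```python
from collections import defaultdict

def q_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update. In the mean-field setting the
    'state' s is (a discretization of) the population's state distribution
    rather than an individual agent's state; that lifting is the key idea,
    while the update rule itself is unchanged."""
    target = r + gamma * max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

Q = defaultdict(float)   # Q-table over (distribution, action) pairs
```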
The problem of allocating limited resources to maintain components of a multicomponent system, known as selective maintenance, is naturally formulated as a high-dimensional Markov decision process (MDP). Unfortunately, these problems are difficult to solve exactly for realistically sized systems. With this motivation, we contribute an approximate dynamic programming (ADP) algorithm for solving the selective maintenance problem for a series–parallel system with binary-state components. To the best of our knowledge, this paper describes the first application of ADP to the maintenance of multicomponent systems. Our ADP is compared, using a numerical example from the literature, against exact solutions of the corresponding MDP. We then summarize the results of a more comprehensive set of experiments demonstrating the ADP's favorable performance on larger instances relative to both the exact (but computationally intensive) MDP approach and the heuristic (but computationally faster) one-step-lookahead approach. Finally, we demonstrate that the ADP can solve an extension of the basic selective maintenance problem in which maintenance resources may be shared across stages.
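A sketch of the one-step-lookahead baseline mentioned above, under assumed model hooks (`transitions`, `step_cost`, and `v_approx` are placeholders, not the paper's notation):

```python
def one_step_lookahead(state, actions, transitions, step_cost, v_approx,
                       gamma=0.95):
    """Choose the action minimizing immediate cost plus the discounted
    (approximate) value of the successor states. `transitions(state, a)`
    is assumed to yield (next_state, probability) pairs."""
    def q(a):
        return step_cost(state, a) + gamma * sum(
            p * v_approx(s2) for s2, p in transitions(state, a))
    return min(actions, key=q)
```

With a trivial `v_approx` this is the heuristic baseline; an ADP in this style would instead learn `v_approx` from simulated trajectories.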