Title: Approximate Dynamic Programming for Selective Maintenance in Series–Parallel Systems
The problem of allocating limited resources to maintain components of a multicomponent system, known as selective maintenance, is naturally formulated as a high-dimensional Markov decision process (MDP). Unfortunately, these problems are difficult to solve exactly for realistically sized systems. With this motivation, we contribute an approximate dynamic programming (ADP) algorithm for solving the selective maintenance problem for a series–parallel system with binary-state components. To the best of our knowledge, this paper describes the first application of ADP to maintain multicomponent systems. Our ADP is compared, using a numerical example from the literature, against exact solutions to the corresponding MDP. We then summarize the results of a more comprehensive set of experiments that demonstrate the ADP’s favorable performance on larger instances in comparison to both the exact (but computationally intensive) MDP approach and the heuristic (but computationally faster) one-step-lookahead approach. Finally, we demonstrate that the ADP is capable of solving an extension of the basic selective maintenance problem in which maintenance resources are permitted to be shared across stages.
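As a rough illustration of the setting described in the abstract, the sketch below computes the reliability of a series arrangement of parallel subsystems with binary-state components, then picks the repair set for a single stage by brute-force enumeration under a budget. It is a myopic, single-stage toy and is not the paper's ADP algorithm or its MDP formulation; the function names, the cost model, and the assumption that a repaired component survives the next mission with its nominal probability are all hypothetical choices made for this example.

```python
from itertools import combinations
from math import prod


def system_reliability(subsystems):
    """Reliability of parallel subsystems connected in series.

    subsystems: list of lists; each inner list holds the probability that
    each component of one parallel subsystem survives the next mission.
    A subsystem works if at least one of its components works.
    """
    return prod(1.0 - prod(1.0 - p for p in sub) for sub in subsystems)


def best_repair_set(working, p_survive, cost, budget):
    """Enumerate repair sets for one stage (hypothetical brute force).

    working[i][j] is True if component j of subsystem i is currently up.
    A failed component that is not repaired contributes probability 0.
    """
    failed = [(i, j) for i, sub in enumerate(working)
              for j, up in enumerate(sub) if not up]
    best_value, best_set = -1.0, ()
    for k in range(len(failed) + 1):
        for repair in combinations(failed, k):
            if sum(cost[i][j] for i, j in repair) > budget:
                continue  # this repair set exceeds the stage budget
            chosen = set(repair)
            probs = [[p_survive[i][j] if (up or (i, j) in chosen) else 0.0
                      for j, up in enumerate(sub)]
                     for i, sub in enumerate(working)]
            value = system_reliability(probs)
            if value > best_value:
                best_value, best_set = value, repair
    return best_value, best_set
```

The enumeration grows exponentially with the number of failed components, which is one way to see why exact approaches become impractical for realistically sized systems.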
Journal Name:
IEEE Transactions on Reliability
Page Range / eLocation ID:
1 to 18
Sponsoring Org:
National Science Foundation
More Like this
  1. Often in manufacturing systems, scenarios arise where the demand for maintenance exceeds the capacity of maintenance resources. This results in the problem of allocating the limited resources among machines competing for them. This maintenance scheduling problem can be formulated as a Markov decision process (MDP) with the goal of finding the optimal dynamic maintenance action given the current system state. However, as the system becomes more complex, solving an MDP suffers from the curse of dimensionality. To overcome this issue, we propose a two-stage approach that first optimizes a static condition-based maintenance (CBM) policy using a genetic algorithm (GA) and then improves the policy online via Monte Carlo tree search (MCTS). The static policy significantly reduces the state space of the online problem by allowing us to ignore machines that are not sufficiently degraded. Furthermore, we formulate MCTS to seek a maintenance schedule that maximizes the long-term production volume of the system to reconcile the conflict between maintenance and production objectives. We demonstrate that the resulting online policy is an improvement over the static CBM policy found by GA. 
  2. Keywords: Green wireless networks; Wake-up radio; Energy harvesting; Routing; Markov decision process; Reinforcement learning. 1. Introduction. With 14.2 billion connected things in 2019, over 41.6 billion expected by 2025, and total spending on endpoints and services that will reach well over $1.1 trillion by the end of 2026, the Internet of Things (IoT) is poised to have a transformative impact on the way we live and on the way we work [1–3]. The vision of this "connected continuum" of objects and people, however, comes with a wide variety of challenges, especially for those IoT networks whose devices rely on some form of depletable energy support. This has prompted research on hardware and software solutions aimed at decreasing the dependence of devices on "pre-packaged" energy provision (e.g., batteries), leading to devices capable of harvesting energy from the environment, and to networks, often called green wireless networks, whose lifetime is virtually infinite. Despite the promising advances of energy harvesting technologies, IoT devices are still doomed to run out of energy due to their inherent constraints on resources such as storage, processing and communication, whose energy requirements often exceed what harvesting can provide. The communication circuitry of prevailing radio technology, especially, consumes a relevant amount of energy even when in idle state, i.e., even when no transmissions or receptions occur. Even duty cycling, namely, operating with the radio in low energy consumption (sleep) mode for pre-set amounts of time, has been shown to only mildly alleviate the problem of making IoT devices durable [4]. An effective answer to eliminate all possible forms of energy consumption that are not directly related to communication (e.g., idle listening) is provided by ultra low power radio triggering techniques, also known as wake-up radios [5,6].
Wake-up radio-based networks allow devices to remain in sleep mode by turning off their main radio when no communication is taking place. Devices continuously listen for a trigger on their wake-up radio, namely, for a wake-up sequence, to activate their main radio and participate in communication tasks. Therefore, devices wake up and turn their main radio on only when data communication is requested by a neighboring device. Further energy savings can be obtained by restricting the number of neighboring devices that wake up when triggered. This is obtained by allowing devices to wake up only when they receive specific wake-up sequences, which correspond to particular protocol requirements, including distance from the destination, current energy status, residual energy, etc. This form of selective awakening is called semantic addressing [7]. Use of low-power wake-up radio with semantic addressing has been shown to remarkably reduce the dominating energy costs of communication and idle listening of traditional radio networking [7–12]. This paper contributes to the research on enabling green wireless networks for long lasting IoT applications. Specifically, we introduce a … Abstract: This paper presents G-WHARP, for Green Wake-up and HARvesting-based energy-Predictive forwarding, a wake-up radio-based forwarding strategy for wireless networks equipped with energy harvesting capabilities (green wireless networks). Following a learning-based approach, G-WHARP blends energy harvesting and wake-up radio technology to maximize energy efficiency and obtain superior network performance. Nodes autonomously decide on their forwarding availability based on a Markov Decision Process (MDP) that takes into account a variety of energy-related aspects, including the currently available energy and that harvestable in the foreseeable future.
Solution of the MDP is provided by a computationally light heuristic based on a simple threshold policy, thus obtaining further computational energy savings. The performance of G-WHARP is evaluated via GreenCastalia simulations, where we accurately model wake-up radios, harvestable energy, and the computational power needed to solve the MDP. Key network and system parameters are varied, including the source of harvestable energy, the network density, wake-up radio data rate and data traffic. We also compare the performance of G-WHARP to that of two state-of-the-art data forwarding strategies, namely GreenRoutes and CTP-WUR. Results show that G-WHARP limits energy expenditures while achieving low end-to-end latency and high packet delivery ratio. Particularly, it consumes up to 34% and 59% less energy than CTP-WUR and GreenRoutes, respectively. 
  3. Adaptive mesh refinement (AMR) is the art of solving PDEs on a mesh hierarchy with increasing mesh refinement at each level of the hierarchy. Accurate treatment on AMR hierarchies requires accurate prolongation of the solution from a coarse mesh to a newly defined finer mesh. For scalar variables, suitably high-order finite volume WENO methods can carry out such a prolongation. However, classes of PDEs, such as computational electrodynamics (CED) and magnetohydrodynamics (MHD), require that vector fields preserve a divergence constraint. The primal variables in such schemes consist of normal components of the vector field that are collocated at the faces of the mesh. As a result, the reconstruction and prolongation strategies for divergence constraint-preserving vector fields are necessarily more intricate. In this paper we present a fourth-order divergence constraint-preserving prolongation strategy that is analytically exact. Extension to higher orders using analytically exact methods is very challenging. To overcome that challenge, a novel WENO-like reconstruction strategy is invented that matches the moments of the vector field in the faces, where the vector field components are collocated. This approach is almost divergence constraint-preserving; therefore, we call it WENO-ADP. To make it exactly divergence constraint-preserving, a touch-up procedure is developed that is based on a constrained least squares (CLSQ) method for restoring the divergence constraint up to machine accuracy. With the touch-up, it is called WENO-ADPT. It is shown that refinement ratios of two and higher can be accommodated. An item of broader interest in this work is that we have also been able to invent very efficient finite volume WENO methods, where the coefficients are very easily obtained and the multidimensional smoothness indicators can be expressed as perfect squares.
We demonstrate that the divergence constraint-preserving strategy works at several high orders for divergence-free vector fields as well as vector fields, where the divergence of the vector field has to match a charge density and its higher moments. We also show that our methods overcome the late time instability that has been known to plague adaptive computations in CED. 
  4. Security of cyber-physical systems (CPS) continues to pose new challenges due to the tight integration and operational complexity of the cyber and physical components. To address these challenges, this article presents a domain-aware, optimization-based approach to determine an effective defense strategy for CPS in an automated fashion, by emulating a strategic adversary in the loop that exploits system vulnerabilities, interconnection of the CPS, and the dynamics of the physical components. Our approach builds on an adversarial decision-making model based on a Markov Decision Process (MDP) that determines the optimal cyber (discrete) and physical (continuous) attack actions over a CPS attack graph. The defense planning problem is modeled as a non-zero-sum game between the adversary and defender. We use a model-free reinforcement learning method to solve the adversary’s problem as a function of the defense strategy. We then employ Bayesian optimization (BO) to find an approximate best-response for the defender to harden the network against the resulting adversary policy. This process is iterated multiple times to improve the strategy for both players. We demonstrate the effectiveness of our approach on a ransomware-inspired graph with a smart building system as the physical process. Numerical studies show that our method converges to a Nash equilibrium for various defender-specific costs of network hardening.
  5. In the aftermath of an extreme natural hazard, community residents must have access to functioning food retailers to maintain food security. Food security is dependent on supporting critical infrastructure systems, including electricity, potable water, and transportation. An understanding of the response of such interdependent networks and the process of post-disaster recovery is the cornerstone of an efficient emergency management plan. In this study, the interconnectedness among different critical facilities, such as electrical power networks, water networks, highway bridges, and food retailers, is modeled. The study considers various sources of uncertainty and complexity in the recovery process of a community to capture the stochastic behavior of the spatially distributed infrastructure systems. The study utilizes an approximate dynamic programming (ADP) framework to allocate resources to restore infrastructure components efficiently. The proposed ADP scheme enables us to identify near-optimal restoration decisions at the community level. Furthermore, we employ a simulated annealing (SA) algorithm to complement the proposed ADP framework and to identify near-optimal actions accurately. In the sequel, we use the City of Gilroy, California, USA to illustrate the applicability of the proposed methodology following a severe earthquake. The approach can be implemented efficiently to identify practical policy interventions to hasten recovery of food systems and to reduce adverse food-insecurity impacts for other hazards and communities. 