

Title: Optimal scheduling of multiple sensors which transmit measurements over a dynamic lossy network
Motivated by various distributed control applications, we consider a linear system with Gaussian noise observed by multiple sensors which transmit measurements over a dynamic lossy network. We characterize the stationary optimal sensor scheduling policy for the finite horizon, discounted, and long-term average cost problems and show that the value iteration algorithm converges to a solution of the average cost problem. We further show that the suboptimal policies provided by the rolling horizon truncation of the value iteration also guarantee geometric ergodicity and provide near-optimal average cost. Lastly, we provide qualitative characterizations of the multidimensional set of measurement loss rates for which the system is stabilizable for a static network, significantly extending earlier results on intermittent observations.
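The average-cost scheduling structure can be illustrated with a toy relative value iteration: one of two sensors is scheduled each step over a lossy link, the state is the age of the newest delivered measurement, and the per-step cost grows with that age (a proxy for estimation-error growth). The loss rates, age cap, and cost function below are illustrative assumptions, not the paper's model:

```python
import numpy as np

AGE_MAX = 10
LOSS = [0.3, 0.6]            # per-sensor packet-loss probabilities (assumed)

def cost(age):
    return float(age ** 2)   # cost grows with measurement age (assumed)

def relative_value_iteration(iters=500):
    """Relative VI for the long-term average cost of sensor scheduling."""
    h = np.zeros(AGE_MAX + 1)                 # relative value function
    for _ in range(iters):
        q = np.empty((AGE_MAX + 1, len(LOSS)))
        for s in range(AGE_MAX + 1):
            nxt = min(s + 1, AGE_MAX)         # age if the packet is lost
            for a, p in enumerate(LOSS):
                # success (prob 1-p) resets the age to 0; loss increments it
                q[s, a] = cost(s) + p * h[nxt] + (1 - p) * h[0]
        Th = q.min(axis=1)
        gain = Th[0]          # with h(0)=0, this converges to the average cost
        h = Th - Th[0]        # normalize so h(0) = 0 (relative VI)
    return gain, q.argmin(axis=1)

gain, policy = relative_value_iteration()
```

In this toy instance the optimal policy always picks the lower-loss sensor, and the gain matches the stationary average of the age cost under that choice.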
Award ID(s):
1715210
PAR ID:
10139611
Author(s) / Creator(s):
Date Published:
Journal Name:
2019 IEEE 58th Conference on Decision and Control (CDC)
Page Range / eLocation ID:
684 to 689
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The practicality of reinforcement learning algorithms has been limited due to poor scaling with respect to the problem size, as the sample complexity of learning an ε-optimal policy is Ω(|S||A|H/ε²) over worst case instances of an MDP with state space S, action space A, and horizon H. We consider a class of MDPs for which the associated optimal Q* function is low rank, where the latent features are unknown. While one would hope to achieve linear sample complexity in |S| and |A| due to the low rank structure, we show that without imposing further assumptions beyond low rank of Q*, if one is constrained to estimate the Q function using only observations from a subset of entries, there is a worst case instance in which one must incur a sample complexity exponential in the horizon H to learn a near optimal policy. We subsequently show that under stronger low rank structural assumptions, given access to a generative model, Low Rank Monte Carlo Policy Iteration (LR-MCPI) and Low Rank Empirical Value Iteration (LR-EVI) achieve the desired sample complexity of Õ((|S|+|A|) poly(d, H)/ε²) for a rank d setting, which is minimax optimal with respect to the scaling of |S|, |A|, and ε. In contrast to literature on linear and low-rank MDPs, we do not require a known feature mapping, our algorithm is computationally simple, and our results hold for long time horizons. Our results provide insights on the minimal low-rank structural assumptions required on the MDP with respect to the transition kernel versus the optimal action-value function.
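The matrix-estimation step behind such guarantees can be sketched in a noiseless, generative-model idealization: a rank-d Q matrix over S×A is recovered exactly from O((|S|+|A|)d) entries via a CUR-style cross approximation. The sizes, rank, and sampling pattern below are illustrative assumptions, not the paper's full algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, d = 50, 20, 3
U = rng.normal(size=(S, d))
V = rng.normal(size=(A, d))
Q = U @ V.T                                  # synthetic rank-d "Q*" matrix

# Observe only 2d anchor rows, 2d anchor columns, and their intersection
rows = rng.choice(S, size=2 * d, replace=False)
cols = rng.choice(A, size=2 * d, replace=False)

# CUR-style reconstruction: columns x (intersection)^+ x rows
mid = np.linalg.pinv(Q[np.ix_(rows, cols)])
Q_hat = Q[:, cols] @ mid @ Q[rows, :]

err = np.linalg.norm(Q - Q_hat) / np.linalg.norm(Q)
```

In the exact rank-d, noiseless setting the reconstruction is exact (up to floating point), so the greedy policy from `Q_hat` matches that from `Q`; the paper's contribution concerns what happens with noisy Monte Carlo estimates of entries.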
    The paper develops distributed control techniques to obtain grid services from flexible loads. The Individual Perspective Design (IPD) for local (load-level) control is extended to piecewise deterministic and diffusion models of thermostatically controlled loads. The IPD design is formulated as an infinite-horizon average-reward optimal control problem, in which the reward function contains a term that uses relative entropy rate to model deviation from nominal dynamics. In the piecewise deterministic model, the optimal solution is obtained via the solution to an eigenfunction problem, similar to what is obtained in prior work. For a jump diffusion model this simple structure is absent. The structure of the optimal solution is obtained, which suggests an ODE technique for computation that is likely far more efficient than policy or value iteration.
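The eigenfunction structure mentioned above can be sketched on a finite chain: tilting the nominal transition matrix by a per-state utility, extracting the principal eigenpair by power iteration, and twisting the kernel with the eigenfunction yields the optimally perturbed dynamics. The chain, utility, and two-state size below are illustrative assumptions, not the paper's load model:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])            # nominal load dynamics (assumed)
r = np.array([0.0, 0.5])              # per-state utility (assumed)

K = np.exp(r)[:, None] * P            # tilted (multiplicatively weighted) kernel

# Power iteration for the principal eigenpair (Perron-Frobenius)
v = np.ones(len(P))
for _ in range(200):
    v = K @ v
    lam = v.max()
    v /= lam

# Optimal "twisted" transition matrix built from the eigenfunction v:
# P_opt(x, y) = K(x, y) v(y) / (lam v(x)), a proper stochastic matrix
P_opt = K * v[None, :] / (lam * v[:, None])
```

The twisted matrix `P_opt` has rows summing to one by the eigenfunction equation, which is the finite-state analogue of the structure the abstract attributes to the piecewise deterministic model.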
  3. We consider a multi-stage inventory system with stochastic demand and processing capacity constraints at each stage, for both finite-horizon and infinite-horizon, discounted-cost settings. For a class of such systems characterized by having the smallest capacity at the most downstream stage and system utilization above a certain threshold, we identify the structure of the optimal policy, which represents a novel variation of the order-up-to policy. We find the explicit functional form of the optimal order-up-to levels, and show that they depend (only) on upstream echelon inventories. We establish that, above the threshold utilization, this optimal policy achieves the decomposition of the multidimensional objective cost function for the system into a sum of single-dimensional convex functions. This decomposition eliminates the curse of dimensionality and allows us to numerically solve the problem. We provide a fast algorithm to determine a (tight) upper bound on this threshold utilization for capacity-constrained inventory problems with an arbitrary number of stages. We make use of this algorithm to quantify upper bounds on the threshold utilization for three-, four-, and five-stage capacitated systems over a range of model parameters, and discuss insights that emerge. 
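A capacitated echelon order-up-to policy of the kind characterized above can be simulated in a few lines for a two-stage serial system: each stage orders up to an echelon base-stock level, truncated by its capacity and by upstream stock. The demand distribution, capacities, levels, and cost rates are illustrative assumptions, not the paper's calibrated values:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000
cap = (5, 8)            # per-stage capacities; downstream is the smallest
S_lvl = (12, 20)        # echelon order-up-to levels (assumed)
inv = [0.0, 0.0]        # local inventories; stage 0 may go negative (backlog)
hold = 0.0

for _ in range(T):
    d = rng.poisson(4)                  # stochastic demand (assumed Poisson)
    ech1 = inv[0]                       # downstream echelon inventory
    ech2 = inv[0] + inv[1]              # upstream echelon inventory
    # Echelon base-stock orders, truncated by capacity and available stock
    o1 = min(cap[0], max(0, S_lvl[0] - ech1), max(0, inv[1]))
    o2 = min(cap[1], max(0, S_lvl[1] - ech2))
    inv[1] += o2 - o1
    inv[0] += o1 - d
    # Unit holding cost at each stage, backlog penalty of 2 downstream (assumed)
    hold += max(inv[0], 0) + 2 * max(-inv[0], 0) + max(inv[1], 0)

avg_cost = hold / T
```

Because downstream capacity (5) exceeds mean demand (4), the simulated system is stable and the downstream inventory never exceeds its order-up-to level.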
  4. We consider the problem of finite-horizon optimal control of a discrete linear time-varying system subject to a stochastic disturbance and fully observable state. The initial state of the system is drawn from a known Gaussian distribution, and the final state distribution is required to reach a given target Gaussian distribution, while minimizing the expected value of the control effort. We derive the linear optimal control policy by first presenting an efficient solution for the diffusion-less case, and we then solve the case with diffusion by reformulating the system as a superposition of diffusion-less systems. We show that the resulting solution coincides with an LQG problem with a particular terminal-cost weight matrix.
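The diffusion-less building block admits a short sketch: minimum-energy open-loop steering of the state (here, the mean) between two points via the N-step controllability Gramian. The system matrices, horizon, and endpoints below are illustrative assumptions:

```python
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])     # discrete double integrator, dt = 0.1 (assumed)
B = np.array([[0.0],
              [0.1]])
N = 20
x0 = np.array([1.0, 0.0])      # initial mean
xf = np.array([0.0, 0.0])      # target mean

# Gramian W = sum_{t=0}^{N-1} A^t B B' (A')^t, accumulated with powers of A
W = np.zeros((2, 2))
cols = []
Ak = np.eye(2)
for t in range(N):
    cols.append(Ak @ B)        # A^t B
    W += (Ak @ B) @ (Ak @ B).T
    Ak = Ak @ A                # after the loop, Ak = A^N

# Minimum-energy control: u_t = B' (A')^{N-1-t} W^{-1} (xf - A^N x0)
lam = np.linalg.solve(W, xf - Ak @ x0)
x = x0.copy()
for t in range(N):
    u = cols[N - 1 - t].T @ lam
    x = A @ x + B @ u
```

The final state lands exactly on the target, since the applied inputs sum through the Gramian to the required endpoint correction.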
  5. In this paper, we study the event-triggered robust stabilization problem of nonlinear systems subject to mismatched perturbations and input constraints. First, with the introduction of an infinite-horizon cost function for the auxiliary system, we transform the robust stabilization problem into a constrained optimal control problem. Then, we prove that the solution of the event-triggered Hamilton–Jacobi–Bellman (ETHJB) equation, which arises in the constrained optimal control problem, guarantees original system states to be uniformly ultimately bounded (UUB). To solve the ETHJB equation, we present a single network adaptive critic design (SN-ACD). The critic network used in the SN-ACD is tuned through the gradient descent method. Using the Lyapunov method, we demonstrate that all the signals in the closed-loop auxiliary system are UUB. Finally, we provide two examples, including the pendulum system, to validate the proposed event-triggered control strategy.
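The event-triggering mechanism itself can be sketched on a scalar unstable plant: feedback is recomputed only when the gap between the current state and the last-sampled state exceeds a state-dependent threshold, so far fewer control updates than time steps occur. The plant, gain, and threshold below are assumptions for illustration, not the paper's SN-ACD controller:

```python
# Event-triggered feedback for x_{t+1} = a x_t + b u_t (assumed scalar plant)
a, b = 1.02, 1.0      # open-loop unstable dynamics (assumed)
k = 0.05              # stabilizing feedback gain (assumed)
sigma = 0.1           # relative trigger threshold (assumed)

x = 1.0
x_s = x               # state sampled at the last triggering event
events = 0
for _ in range(400):
    # Event condition: sampling error exceeds a fraction of the state norm
    if abs(x - x_s) > sigma * abs(x):
        x_s, events = x, events + 1    # trigger: resample the state
    x = a * x - b * k * x_s            # control u = -k x_s held between events
```

Between events the trigger condition bounds the sampling error by sigma·|x|, so the closed loop contracts by at most |a − bk| + bk·sigma < 1 per step while updating the control only intermittently.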