skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, October 10 until 2:00 AM ET on Friday, October 11 due to maintenance. We apologize for the inconvenience.


Title: On Robust Control of Partially Observed Uncertain Systems with Additive Costs
In this paper, we consider the problem of optimizing the worst-case behavior of a partially observed system. All uncontrolled disturbances are modeled as finite-valued uncertain variables. Using the theory of cost distributions, we present a dynamic programming (DP) approach to compute a control strategy that minimizes the maximum possible total cost over a given time horizon. To improve the computational efficiency of the optimal DP, we introduce a general definition for information states and show that many information states constructed in previous research efforts are special cases of ours. Additionally, we define approximate information states and an approximate DP that can further improve computational tractability by conceding a bounded performance loss. We illustrate the utility of these results using a numerical example.  more » « less
Award ID(s):
2149520 2219761
NSF-PAR ID:
10421259
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2023 American Control Conference
Page Range / eLocation ID:
4639-4644
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Weller, Adrian (Ed.)
    Differential privacy (DP) offers strong theoretical privacy guarantees, though implementations of DP mechanisms may be vulnerable to side-channel attacks, such as timing attacks. When sampling methods such as MCMC or rejection sampling are used to implement a mechanism, the runtime can leak private information. We characterize the additional privacy cost due to the runtime of a rejection sampler in terms of both (epsilon,delta)-DP as well as f-DP. We also show that unless the acceptance probability is constant across databases, the runtime of a rejection sampler does not satisfy epsilon-DP for any epsilon. We show that there is a similar breakdown in privacy with adaptive rejection samplers. We propose three modifications to the rejection sampling algorithm, with varying assumptions, to protect against timing attacks by making the runtime independent of the data. The modification with the weakest assumptions is an approximate sampler, introducing a small increase in the privacy cost, whereas the other modifications give perfect samplers. We also use our techniques to develop an adaptive rejection sampler for log-Holder densities, which also has data-independent runtime. We give several examples of DP mechanisms that fit the assumptions of our methods and can thus be implemented using our samplers. 
    more » « less
  2. Weller, Adrian (Ed.)
    While differential privacy (DP) offers strong theoretical privacy guarantees, implementations of DP mechanisms may be vulnerable to side-channel attacks, such as timing attacks. When sampling methods such as MCMC or rejection sampling are used to implement a privacy mechanism, the runtime can leak private information. We characterize the additional privacy cost due to the runtime of a rejection sampler in terms of both (, δ)-DP as well as f -DP. We also show that unless the acceptance probability is constant across databases, the runtime of a rejection sampler does not satisfy -DP for any . We show that there is a similar breakdown in privacy with adaptive rejection samplers. We propose three modifications to the rejection sampling algorithm, with varying assumptions, to protect against timing attacks by making the runtime independent of the data. The modification with the weakest assumptions is an approximate sampler, introducing a small increase in the privacy cost, whereas the other modifications give perfect samplers. We also use our techniques to develop an adaptive rejection sampler for log-H ̈older densities, which also has data-independent runtime. We give several examples of DP mechanisms that fit the assumptions of our methods and can thus be implemented using our samplers. 
    more » « less
  3. Due to information asymmetry, finding optimal policies for Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) is hard with the complexity growing doubly exponentially in the horizon length. The challenge increases greatly in the multi-agent reinforcement learning (MARL) setting where the transition probabilities, observation kernel, and reward function are unknown. Here, we develop a general compression framework with approximate common and private state representations, based on which decentralized policies can be constructed. We derive the optimality gap of executing dynamic programming (DP) with the approximate states in terms of the approximation error parameters and the remaining time steps. When the compression is exact (no error), the resulting DP is equivalent to the one in existing work. Our general framework generalizes a number of methods proposed in the literature. The results shed light on designing practically useful deep-MARL network structures under the "centralized learning distributed execution" scheme. 
    more » « less
  4. Marc'Aurelio Ranzato, Alina Beygelzimer (Ed.)
    Implementations of the exponential mechanism in differential privacy often require sampling from intractable distributions. When approximate procedures like Markov chain Monte Carlo (MCMC) are used, the end result incurs costs to both privacy and accuracy. Existing work has examined these effects asymptotically, but implementable finite sample results are needed in practice so that users can specify privacy budgets in advance and implement samplers with exact privacy guarantees. In this paper, we use tools from ergodic theory and perfect simulation to design exact finite runtime sampling algorithms for the exponential mechanism by introducing an intermediate modified target distribution using artificial atoms. We propose an additional modification of this sampling algorithm that maintains its ǫ-DP guarantee and has improved runtime at the cost of some utility. We then compare these methods in scenarios where we can explicitly calculate a δ cost (as in (ǫ, δ)-DP) incurred when using standard MCMC techniques. Much as there is a well known trade-off between privacy and utility, we demonstrate that there is also a trade-off between privacy guarantees and runtime. 
    more » « less
  5. Given a spatial network and a set of service centers from k different resource types, a Multiple Resource Network Voronoi Diagram (MRNVD) partitions the spatial network into a set of Service Areas that can minimize the total cycle-distances of graph-nodes to allotted k service centers with different resource types. The MRNVD problem is important for critical societal applications such as assigning essential survival supplies (e.g., food, water, gas, and medical assistance) to residents impacted by man-made or natural disasters. The MRNVD problem is NP-hard; it is computationally challenging due to the large size of the transportation network. Previous work proposed the Distance bounded Pruning (DP) approach to produce an optimal solution for MRNVD. However, we found that DP can be generalized to reduce the computational cost for the minimum cycle-distance. In this paper, we extend our prior work and propose a novel approach that reduces the computational cost. Experiments using real-world datasets from five different regions demonstrate that the proposed approach creates MRNVD and significantly reduces the computational cost. 
    more » « less