Title: On Team Decision Problems with Nonclassical Information Structures
In this article, we consider sequential dynamic team decision problems with nonclassical information structures. First, we address the problem from the point of view of a “manager” who seeks to derive the optimal strategy of the team in a centralized process. We derive structural results that yield an information state for the team that does not depend on the control strategy and can therefore lead to a dynamic programming decomposition in which the optimization problem is over the space of the team’s decisions. We then derive structural results for each team member that yield an information state that does not depend on their control strategy and can therefore lead to a dynamic programming decomposition in which the optimization problem for each team member is over the space of their decisions. Finally, we show that the solution of each team member is the same as the one derived by the manager. We present an illustrative example of a dynamic team with a delayed sharing information structure.
Award ID(s):
2401007 2348381
NSF-PAR ID:
10508511
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE Transactions on Automatic Control
Volume:
68
Issue:
7
ISSN:
0018-9286
Page Range / eLocation ID:
3915–3930
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Simultaneous evaluation of multiple time scale decisions has been regarded as a promising avenue to increase the process efficiency and profitability through leveraging their synergistic interactions. Feasibility of such an integral approach is essential to establish a guarantee for operability of the derived decisions. In this study, we present a modeling methodology to integrate process design, scheduling, and advanced control decisions with a single mixed‐integer dynamic optimization (MIDO) formulation while providing certificates of operability for the closed‐loop implementation. We use multi‐parametric programming to derive explicit expressions for the model predictive control strategy, which is embedded into the MIDO using the base‐2 numeral system that enhances the computational tractability of the integrated problem by exponentially reducing the required number of binary variables. Moreover, we apply the State Equipment Network representation within the MIDO to systematically evaluate the scheduling decisions. The proposed framework is illustrated with two batch processes with different complexities.
  2. Most cyber–physical systems (CPS) encounter a large volume of data which is added to the system gradually in real time and not altogether in advance. In this paper, we provide a theoretical framework that yields optimal control strategies for such CPS at the intersection of control theory and learning. In the proposed framework, we use the actual CPS, i.e., the “true” system that we seek to optimally control online, in parallel with a model of the CPS that is available. We then institute an information state for the system which does not depend on the control strategy. An important consequence of this independence is that for any given choice of a control strategy and a realization of the system’s variables until time t, the information states at future times do not depend on the choice of the control strategy at time t but only on the realization of the decision at time t, and thus they are related to the concept of separation between estimation of the state and control. Namely, the future information states are separated from the choice of the current control strategy. Such control strategies are called separated control strategies. Hence, we can derive offline the optimal control strategy of the system with respect to the information state, which might not be precisely known due to model uncertainties or complexity of the system, and then use standard learning approaches to learn the information state online while data are added gradually to the system in real time. We show that after the information state becomes known, the separated control strategy of the CPS model derived offline is optimal for the actual system. We illustrate the proposed framework in a dynamic system consisting of two subsystems with a delayed sharing information structure.
  3. Teams can often be viewed as a dynamic system in which the team configuration evolves over time (e.g., new members join the team; existing members leave the team; the skills of the members improve over time). Consequently, the performance of the team may change as a result of such team dynamics. A natural question is how to plan the (re-)staffing actions (e.g., recruiting a new team member) at each time step so as to maximize the expected cumulative performance of the team. In this paper, we address the problem of real-time team optimization by intelligently selecting the best candidates so as to increase the similarity between the current team and high-performance teams, according to the team configuration at each time step. The key idea is to formulate the problem as a Markov decision process (MDP) and leverage recent advances in reinforcement learning to optimize the team dynamically. The proposed method has two main advantages: (1) dynamics, being able to model the dynamics of the team to optimize the initial team towards the direction of a high-performance team via performance feedback; and (2) efficacy, being able to handle the large state/action space via deep reinforcement learning based value estimation. We demonstrate the effectiveness of the proposed method through extensive empirical evaluations.
  4. Approximating the Koopman operator from data is numerically challenging when many lifting functions are considered. Even low-dimensional systems can yield unstable or ill-conditioned results in a high-dimensional lifted space. In this paper, Extended Dynamic Mode Decomposition (DMD) and DMD with control, two methods for approximating the Koopman operator, are reformulated as convex optimization problems with linear matrix inequality constraints. Asymptotic stability constraints and system norm regularizers are then incorporated as methods to improve the numerical conditioning of the Koopman operator. Specifically, the H∞ norm is used to penalize the input–output gain of the Koopman system. Weighting functions are then applied to penalize the system gain at specific frequencies. These constraints and regularizers introduce bilinear matrix inequality constraints to the regression problem, which are handled by solving a sequence of convex optimization problems. Experimental results using data from an aircraft fatigue structural test rig and a soft robot arm highlight the advantages of the proposed regression methods.
  5. In this work, we study the optimal design of two-armed clinical trials to maximize the accuracy of parameter estimation in a statistical model, where the interaction between patient covariates and treatment is explicitly incorporated to enable precision medication decisions. Such a modeling extension leads to significant complexities for the resulting optimization problems because they include optimization over design and covariates concurrently. We take a min-max optimization model and minimize (over design) the maximum (over population) variance of the estimated interaction effect between treatment and patient covariates. This results in a min-max bilevel mixed integer nonlinear programming problem, which is notably challenging to solve. To address this challenge, we introduce a surrogate optimization model by approximating the objective function, for which we propose two solution approaches. The first approach provides an exact solution based on reformulation and decomposition techniques. In the second approach, we provide a lower bound for the inner optimization problem and solve the outer optimization problem over the lower bound. We test our proposed algorithms with synthetic and real-world data sets and compare them with standard (re)randomization methods. Our numerical analysis suggests that the proposed approaches provide higher-quality solutions in terms of the variance of estimators and probability of correct selection. We also show the value of covariate information in precision medicine clinical trials by comparing our proposed approaches to an alternative optimal design approach that does not consider the interaction terms between covariates and treatment. Summary of Contribution: Precision medicine is the future of healthcare, where treatment is prescribed based on each patient's information. Designing precision medicine clinical trials, which are the cornerstone of precision medicine, is extremely challenging because sample size is limited and patient information may be multidimensional. This work proposes a novel approach to optimally estimate the treatment effect for each patient type in a two-armed clinical trial by reducing the largest variance of personalized treatment effect. We use several statistical and optimization techniques to produce efficient solution methodologies. Results have the potential to save countless lives by transforming the design and implementation of future clinical trials to ensure the right treatments for the right patients. Doing so will reduce patient risks and reduce costs in the healthcare system.