Title: On Team Decision Problems with Nonclassical Information Structures
In this article, we consider sequential dynamic team decision problems with nonclassical information structures. First, we address the problem from the point of view of a “manager” who seeks to derive the optimal strategy of the team in a centralized process. We derive structural results that yield an information state for the team which does not depend on the control strategy, and thus can lead to a dynamic programming decomposition where the optimization problem is over the space of the team’s decisions. We then derive structural results for each team member that yield an information state which does not depend on their control strategy, and thus can lead to a dynamic programming decomposition where the optimization problem for each team member is over the space of their decisions. Finally, we show that the solution of each team member is the same as the one derived by the manager. We present an illustrative example of a dynamic team with a delayed sharing information structure.
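As a rough sketch of the resulting decomposition (the notation here is ours, not the paper's): if Π_t denotes the team's information state and Z_{t+1} the new information received at the next stage, the manager's dynamic program takes the form

\[
V_t(\Pi_t) \;=\; \min_{u_t \in \mathcal{U}_t} \; \mathbb{E}\!\left[\, c_t(X_t, u_t) \;+\; V_{t+1}\!\big(\phi_t(\Pi_t, Z_{t+1}, u_t)\big) \;\middle|\; \Pi_t,\, u_t \right],
\]

where the update map \phi_t depends only on the realized team decision u_t and the new observations, not on the team's control strategy.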
Award ID(s):
2401007, 2348381
PAR ID:
10508511
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE Transactions on Automatic Control
Volume:
68
Issue:
7
ISSN:
0018-9286
Page Range / eLocation ID:
3915–3930
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Simultaneous evaluation of decisions across multiple time scales has been regarded as a promising avenue for increasing process efficiency and profitability by leveraging their synergistic interactions. The feasibility of such an integrated approach is essential to guarantee the operability of the derived decisions. In this study, we present a modeling methodology that integrates process design, scheduling, and advanced control decisions in a single mixed‐integer dynamic optimization (MIDO) formulation while providing certificates of operability for the closed‐loop implementation. We use multi‐parametric programming to derive explicit expressions for the model predictive control strategy, which is embedded into the MIDO using the base‐2 numeral system; this enhances the computational tractability of the integrated problem by exponentially reducing the required number of binary variables. Moreover, we apply the State Equipment Network representation within the MIDO to systematically evaluate the scheduling decisions. The proposed framework is illustrated with two batch processes of different complexities.
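A rough illustration of the base‐2 encoding idea (our notation, not the paper's): selecting one of N = 8 candidate modes or critical regions ordinarily requires eight binary variables with an exclusivity constraint, whereas a base‐2 encoding needs only \lceil \log_2 8 \rceil = 3 binaries,

\[
\sum_{i=1}^{8} y_i = 1,\quad y_i \in \{0,1\} \qquad\longrightarrow\qquad k \;=\; \sum_{j=0}^{2} 2^{j} b_j,\quad b_j \in \{0,1\},
\]

so the number of binary variables grows logarithmically rather than linearly in N.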
  2. Most cyber–physical systems (CPS) encounter a large volume of data that is added to the system gradually in real time rather than all at once in advance. In this paper, we provide a theoretical framework that yields optimal control strategies for such CPS at the intersection of control theory and learning. In the proposed framework, we use the actual CPS, i.e., the “true” system that we seek to optimally control online, in parallel with an available model of the CPS. We then institute an information state for the system which does not depend on the control strategy. An important consequence of this independence is that, for any given choice of a control strategy and any realization of the system’s variables up to time t, the information states at future times do not depend on the choice of the control strategy at time t but only on the realization of the decision at time t; they are thus related to the concept of separation between state estimation and control. Namely, the future information states are separated from the choice of the current control strategy. Such control strategies are called separated control strategies. Hence, we can derive offline the optimal control strategy of the system with respect to the information state, which might not be precisely known due to model uncertainties or the complexity of the system, and then use standard learning approaches to learn the information state online as data are added gradually to the system in real time. We show that once the information state becomes known, the separated control strategy of the CPS model derived offline is optimal for the actual system. We illustrate the proposed framework on a dynamic system consisting of two subsystems with a delayed sharing information structure.
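A minimal sketch of the separation idea on a toy scalar system (the model, observer, and all names below are illustrative assumptions, not the paper's CPS): the strategy is computed offline as a function of an information state, and online only that state is updated from streaming data.

    # Hedged sketch: a "separated" strategy for a toy scalar system.
    # The policy is derived offline as a function of an information state
    # (here, a running state estimate); online, only that estimate is
    # updated from incoming measurements. Toy model, not the paper's CPS.
    import numpy as np

    a, b = 0.9, 1.0          # assumed model of the CPS (used offline)
    q, r = 1.0, 0.1          # quadratic stage-cost weights

    # Offline: iterate the scalar Riccati recursion to get the optimal gain.
    p = q
    for _ in range(500):
        k = a * b * p / (r + b * b * p)
        p = q + a * p * (a - b * k)

    def separated_policy(info_state):
        """Offline strategy: depends only on the information state."""
        return -k * info_state

    # Online: the "true" system generates data; the information state is
    # updated recursively (a simple observer serves as the learner here).
    rng = np.random.default_rng(0)
    x, x_hat, L = 1.0, 0.0, 0.5
    for t in range(20):
        u = separated_policy(x_hat)            # action uses only the info state
        y = x + 0.05 * rng.standard_normal()   # noisy measurement arrives online
        x_hat = a * x_hat + b * u + L * (y - x_hat)   # info-state update
        x = a * x + b * u + 0.05 * rng.standard_normal()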
  3. Teams can often be viewed as a dynamic system in which the team configuration evolves over time (e.g., new members join the team, existing members leave the team, and the skills of the members improve over time). Consequently, the performance of the team may change due to such team dynamics. A natural question is how to plan (re-)staffing actions (e.g., recruiting a new team member) at each time step so as to maximize the expected cumulative performance of the team. In this paper, we address the problem of real-time team optimization by intelligently selecting the best candidates to increase the similarity between the current team and high-performance teams, according to the team configuration at each time step. The key idea is to formulate it as a Markov decision process (MDP) and leverage recent advances in reinforcement learning to optimize the team dynamically. The proposed method has two main advantages: (1) dynamics, being able to model the dynamics of the team and optimize the initial team towards a high-performance team via performance feedback; and (2) efficacy, being able to handle the large state/action space via value estimation based on deep reinforcement learning. We demonstrate the effectiveness of the proposed method through extensive empirical evaluations.
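A minimal sketch of the MDP framing (the paper uses deep reinforcement learning for value estimation; tabular Q-learning on a toy two-member team is used here only to keep the example self-contained, and all names and dynamics are illustrative):

    # State: sorted tuple of member skill levels; action: skill of the recruit
    # who replaces the weakest member; reward: similarity to a target
    # "high-performance" configuration. Toy instance, not the paper's data.
    import random
    from collections import defaultdict

    SKILLS = (0, 1, 2)
    TARGET = (2, 2)                              # toy high-performance team

    def similarity(team):
        return -sum(abs(s - t) for s, t in zip(team, TARGET))

    def step(team, recruit_skill):
        members = sorted(team)
        members[0] = recruit_skill               # re-staff the weakest slot
        if random.random() < 0.2:                # team dynamics: skills may decay
            members[0] = max(0, members[0] - 1)
        nxt = tuple(sorted(members))
        return nxt, similarity(nxt)

    Q = defaultdict(float)                       # tabular value estimates
    alpha, gamma, eps = 0.1, 0.9, 0.1
    team = (0, 0)
    for _ in range(5000):
        a = random.choice(SKILLS) if random.random() < eps else \
            max(SKILLS, key=lambda s: Q[(team, s)])
        nxt, reward = step(team, a)
        best_next = max(Q[(nxt, s)] for s in SKILLS)
        Q[(team, a)] += alpha * (reward + gamma * best_next - Q[(team, a)])
        team = nxt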
  4. Abstract: We examine a dynamic disclosure model in which the value of a firm follows a random walk. Every period, with some probability, the manager learns the firm's value and decides whether to disclose it. The manager maximizes the market perception of the firm's value, which is based on disclosed information. In equilibrium, the manager follows a threshold strategy with thresholds below current prices. He sometimes reveals pessimistic information that reduces the market perception of the firm's value. He does so to reduce future market uncertainty, which is valuable even under risk neutrality.
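A minimal formalization of the threshold strategy, in our own notation (not the paper's): with p_t the current market perception and τ_t the equilibrium threshold,

\[
d_t(v_t) \;=\;
\begin{cases}
\text{disclose } v_t, & v_t \ge \tau_t,\\
\text{withhold}, & v_t < \tau_t,
\end{cases}
\qquad \tau_t < p_t,
\]

so some values below the current price are still disclosed, trading a lower immediate perception for reduced future market uncertainty.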
  5. Approximating the Koopman operator from data is numerically challenging when many lifting functions are considered. Even low-dimensional systems can yield unstable or ill-conditioned results in a high-dimensional lifted space. In this paper, Extended Dynamic Mode Decomposition (EDMD) and DMD with control, two methods for approximating the Koopman operator, are reformulated as convex optimization problems with linear matrix inequality constraints. Asymptotic stability constraints and system norm regularizers are then incorporated as methods to improve the numerical conditioning of the Koopman operator. Specifically, the H∞ norm is used to penalize the input–output gain of the Koopman system. Weighting functions are then applied to penalize the system gain at specific frequencies. These constraints and regularizers introduce bilinear matrix inequality constraints to the regression problem, which are handled by solving a sequence of convex optimization problems. Experimental results using data from an aircraft fatigue structural test rig and a soft robot arm highlight the advantages of the proposed regression methods.
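A minimal sketch of the baseline EDMD regression being regularized (plain least squares on a toy one-dimensional system; the stability LMIs, H∞ penalties, and weighting functions from the paper would be layered on top via a semidefinite solver, and all names here are illustrative):

    # EDMD baseline: lift snapshots, then fit a Koopman matrix by least squares.
    # Ill-conditioning of exactly this regression in high-dimensional lifted
    # spaces is what the constrained/regularized reformulations address.
    import numpy as np

    def lift(x):
        return np.array([x, x**2, x**3])          # simple polynomial lifting

    rng = np.random.default_rng(0)
    xs = rng.uniform(-1.0, 1.0, 200)
    xs_next = 0.9 * xs - 0.1 * xs**3              # toy nonlinear dynamics

    Psi = np.stack([lift(x) for x in xs], axis=1)        # lifted snapshots
    Psi_next = np.stack([lift(x) for x in xs_next], axis=1)

    # Koopman matrix U minimizing ||Psi_next - U @ Psi||_F (unconstrained).
    U = Psi_next @ np.linalg.pinv(Psi)
    print("spectral radius:", max(abs(np.linalg.eigvals(U))))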