skip to main content


Search for: All records

Award ID contains: 1646556

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
    Multi-robot cooperation requires agents to make decisions that are consistent with the shared goal without disregarding action-specific preferences that might arise from asymmetry in capabilities and individual objectives. To accomplish this goal, we propose a method named SLiCC: Stackelberg Learning in Cooperative Control. SLiCC models the problem as a partially observable stochastic game composed of Stackelberg bimatrix games, and uses deep reinforcement learning to obtain the payoff matrices associated with these games. Appropriate cooperative actions are then selected with the derived Stackelberg equilibria. Using a bi-robot cooperative object transportation problem, we validate the performance of SLiCC against centralized multi-agent Q-learning and demonstrate that SLiCC achieves better combined utility. 
    more » « less
  2. null (Ed.)
    We frame the collision avoidance problem of multi-agent autonomous vehicle systems into an online convex optimization problem of minimizing certain aggregate cost over the time horizon. We then propose a distributed real-time collision avoidance algorithm based on the online gradient algorithm for solving the resulting online convex optimization problem. We characterize the performance of the algorithm with respect to a static offline optimization, and show that, by choosing proper stepsizes, the upper bound on the performance gap scales sublinearly in time. The numerical experiment shows that the proposed algorithm can achieve better collision avoidance performance than the existing Optimal Reciprocal Collision Avoidance (ORCA) algorithm, due to less aggressive velocity updates that can better prevent the collision in the long run. 
    more » « less
  3. Modern cyber-physical systems (CPS) are often developed in a model-based development (MBD) paradigm. The MBD paradigm involves the construction of different kinds of models: (1) a plant model that encapsulates the physical components of the system (e.g., mechanical, electrical, chemical components) using representations based on differential and algebraic equations, (2) a controller model that encapsulates the embedded software components of the system, and (3) an environment model that encapsulates physical assumptions on the external environment of the CPS application. In order to reason about the correctness of CPS applications, we typically pose the following question: For all possible environment scenarios, does the closed-loop system consisting of the plant and the controller exhibit the desired behavior? Typically, the desired behavior is expressed in terms of properties that specify unsafe behaviors of the closed-loop system. Often, such behaviors are expressed using variants of real-time temporal logics. In this chapter, we will examine formal methods based on bounded-time reachability analysis, simulation-guided reachability analysis, deductive techniques based on safety invariants, and formal, requirement-driven testing techniques. We will review key results in the literature, and discuss the scalability and applicability of such systems to various academic and industrial contexts. We conclude this chapter by discussing the challenge to formal verification and testing techniques posed by newer CPS applications that use AI-based software components. 
    more » « less
  4. In this paper, we provide an approach to data-driven control for artificial pancreas systems by learning neural network models of human insulin-glucose physiology from available patient data and using a mixed integer optimization approach to control blood glucose levels in real-time using the inferred models. First, our approach learns neural networks to predict the future blood glucose values from given data on insulin infusion and their resulting effects on blood glucose levels. However, to provide guarantees on the resulting model, we use quantile regression to fit multiple neural networks that predict upper and lower quantiles of the future blood glucose levels, in addition to the mean. Using the inferred set of neural networks, we formulate a model-predictive control scheme that adjusts both basal and bolus insulin delivery to ensure that the risk of harmful hypoglycemia and hyperglycemia are bounded using the quantile models while the mean prediction stays as close as possible to the desired target. We discuss how this scheme can handle disturbances from large unannounced meals as well as infeasibilities that result from situations where the uncertainties in future glucose predictions are too high. We experimentally evaluate this approach on data obtained from a set of 17 patients over a course of 40 nights per patient. Furthermore, we also test our approach using neural networks obtained from virtual patient models available through the UVA-Padova simulator for type-1 diabetes. 
    more » « less