skip to main content


Title: Learning Linear Complementarity Systems
This paper investigates the learning, or system identification, of a class of piecewise-affine dynamical systems known as linear complementarity systems (LCSs). We propose a violation-based loss which enables efficient learning of the LCS parameterization, without prior knowledge of the hybrid mode boundaries, using gradient-based methods. The proposed violation-based loss incorporates both dynamics prediction loss and a novel complementarity - violation loss. We show several properties attained by this loss formulation, including its differentiability, the efficient computation of first- and second-order derivatives, and its relationship to the traditional prediction loss, which strictly enforces complementarity. We apply this violation-based loss formulation to learn LCSs with tens of thousands of (potentially stiff) hybrid modes. The results demonstrate a state-of-the-art ability to identify piecewise-affine dynamics, outperforming methods which must differentiate through non-smooth linear complementarity problems.  more » « less
Award ID(s):
1830218
NSF-PAR ID:
10335683
Author(s) / Creator(s):
; ; ;
Editor(s):
Firoozi, Roya; Mehr, Negar; Yel, Esen; Antonova, Rika; Bohg, Jeannette; Schwager, Mac; Kochenderfer, Mykel
Date Published:
Journal Name:
Learning for Dynamics and Control Conference
Volume:
168
Page Range / eLocation ID:
1137-1149
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Simulating stiff materials in applications where deformations are either not significant or else can safely be ignored is a fundamental task across fields. Rigid body modeling has thus long remained a critical tool and is, by far, the most popular simulation strategy currently employed for modeling stiff solids. At the same time, rigid body methods continue to pose a number of well known challenges and trade-offs including intersections, instabilities, inaccuracies, and/or slow performances that grow with contact-problem complexity. In this paper we revisit the stiff body problem and present ABD, a simple and highly effective affine body dynamics framework, which significantly improves state-of-the-art for simulating stiff-body dynamics. We trace the challenges in rigid-body methods to the necessity of linearizing piecewise-rigid trajectories and subsequent constraints. ABD instead relaxes the unnecessary (and unrealistic) constraint that each body's motion be exactly rigid with a stiff orthogonality potential, while preserving the rigid body model's key feature of a small coordinate representation. In doing so ABD replaces piecewise linearization with piecewise linear trajectories. This, in turn, combines the best of both worlds: compact coordinates ensure small, sparse system solves, while piecewise-linear trajectories enable efficient and accurate constraint (contact and joint) evaluations. Beginning with this simple foundation, ABD preserves all guarantees of the underlying IPC model we build it upon, e.g., solution convergence, guaranteed non-intersection, and accurate frictional contact. Over a wide range and scale of simulation problems we demonstrate that ABD brings orders of magnitude performance gains (two- to three-orders on the CPU and an order more when utilizing the GPU, obtaining 10, 000× speedups) over prior IPC-based methods, while maintaining simulation quality and nonintersection of trajectories. At the same time ABD has comparable or faster timings when compared to state-of-the-art rigid body libraries optimized for performance without guarantees, and successfully and efficiently solves challenging simulation problems where both classes of prior rigid body simulation methods fail altogether. 
    more » « less
  2. A general formulation of piecewise linear systems with discontinuous force elements is provided in this paper. It has been demonstrated that this class of nonlinear systems is of great importance due to their ability to accurately model numerous scientific and engineering phenomena. Additionally, it is shown that this class of nonlinear systems can demonstrate a wide spectrum of nonlinear motions and in fact, the phenomenon of weak chaos is observed in a mechanical assembly for the first time.

    Despite such importance, efficient methods for fast and accurate evaluation of piecewise linear systems’ responses are lacking and the methods of the literature are either incompatible, very slow, very inaccurate, or bear a combination of the aforementioned deficiencies. To overcome this shortcoming, a novel symbolic-numeric method is presented in this paper that is able to obtain the analytical response of piecewise linear systems with discontinuous elements in an efficient manner. Contrary to other efficient methods that are based on stationary steady state dynamics, this method will not experience failure upon the occurrence of complex motion and is able to capture the entirety of the dynamics. 

    more » « less
  3. The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisily and partially observed data. We compare pure data-driven learning with hybrid models which incorporate imperfect domain knowledge, referring to the discrepancy between an assumed truth model and the imperfect mechanistic model as model error. Our formulation is agnostic to the chosen machine learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and errors that are memoryless. First, we study memoryless linear (w.r.t. parametric-dependence) model error from a learning theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both excess risk and generalization error are bounded above by terms that diminish with the square-root of T T , the time-interval over which training data is specified. Secondly, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error, assuming that it is governed by a finite-dimensional hidden variable and that, together, the observed and hidden variables form a continuous-time Markovian system. In addition, we connect one class of RNNs to reservoir computing, thereby relating learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features. Numerical results are presented (Lorenz ’63, Lorenz ’96 Multiscale systems) to compare purely data-driven and hybrid approaches, finding hybrid methods less datahungry and more parametrically efficient. We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain- interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field defining the true dynamics cannot be identified. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially-observed data, and illustrate challenges in representing memory by this approach, and in the training of such models. 
    more » « less
  4. Abstract We develop new theoretical results on matrix perturbation to shed light on the impact of architecture on the performance of a deep network. In particular, we explain analytically what deep learning practitioners have long observed empirically: the parameters of some deep architectures (e.g., residual networks, ResNets, and Dense networks, DenseNets) are easier to optimize than others (e.g., convolutional networks, ConvNets). Building on our earlier work connecting deep networks with continuous piecewise-affine splines, we develop an exact local linear representation of a deep network layer for a family of modern deep networks that includes ConvNets at one end of a spectrum and ResNets, DenseNets, and other networks with skip connections at the other. For regression and classification tasks that optimize the squared-error loss, we show that the optimization loss surface of a modern deep network is piecewise quadratic in the parameters, with local shape governed by the singular values of a matrix that is a function of the local linear representation. We develop new perturbation results for how the singular values of matrices of this sort behave as we add a fraction of the identity and multiply by certain diagonal matrices. A direct application of our perturbation results explains analytically why a network with skip connections (such as a ResNet or DenseNet) is easier to optimize than a ConvNet: thanks to its more stable singular values and smaller condition number, the local loss surface of such a network is less erratic, less eccentric, and features local minima that are more accommodating to gradient-based optimization. Our results also shed new light on the impact of different nonlinear activation functions on a deep network’s singular values, regardless of its architecture. 
    more » « less
  5. We study the fair division problem of allocating a mixed manna under additively separable piecewise linear concave (SPLC) utilities. A mixed manna contains goods that everyone likes and bads (chores) that everyone dislikes as well as items that some like and others dislike. The seminal work of Bogomolnaia et al. argues why allocating a mixed manna is genuinely more complicated than a good or a bad manna and why competitive equilibrium is the best mechanism. It also provides the existence of equilibrium and establishes its distinctive properties (e.g., nonconvex and disconnected set of equilibria even under linear utilities) but leaves the problem of computing an equilibrium open. Our main results are a linear complementarity problem formulation that captures all competitive equilibria of a mixed manna under SPLC utilities (a strict generalization of linear) and a complementary pivot algorithm based on Lemke’s scheme for finding one. Experimental results on randomly generated instances suggest that our algorithm is fast in practice. Given the [Formula: see text]-hardness of the problem, designing such an algorithm is the only non–brute force (nonenumerative) option known; for example, the classic Lemke–Howson algorithm for computing a Nash equilibrium in a two-player game is still one of the most widely used algorithms in practice. Our algorithm also yields several new structural properties as simple corollaries. We obtain a (constructive) proof of existence for a far more general setting, membership of the problem in [Formula: see text], a rational-valued solution, and an odd number of solutions property. The last property also settles the conjecture of Bogomolnaia et al. in the affirmative. Furthermore, we show that, if the number of either agents or items is a constant, then the number of pivots in our algorithm is strongly polynomial when the mixed manna contains all bads. 
    more » « less