Title: Implicit Graph Neural Networks: A Monotone Operator Viewpoint
Implicit graph neural networks (IGNNs) – which solve a fixed-point equilibrium equation using Picard iteration for representation learning – have shown remarkable performance in learning long-range dependencies (LRD) in the underlying graphs. However, IGNNs suffer from several issues, including 1) their expressivity is limited by the constrained parameterization required to guarantee well-posedness, 2) they are unstable when learning LRD, and 3) they become computationally inefficient when learning LRD. In this paper, we provide a new well-posedness characterization for IGNNs leveraging monotone operator theory, resulting in a much more expressive parameterization than the existing one. We also propose an orthogonal parameterization for IGNN based on the Cayley transform to stabilize learning LRD. Furthermore, we leverage Anderson-accelerated operator splitting schemes to efficiently solve for the fixed point of the equilibrium equation of IGNN with monotone or orthogonal parameterization. We verify the computational efficiency and accuracy of the new models over existing IGNNs on various graph learning tasks at both graph and node levels. Code is available at https://github.com/Utah-Math-Data-Science/MIGNN
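To make the computational pattern concrete, here is a minimal NumPy sketch of the IGNN equilibrium equation Z = φ(W Z Â + U X) solved by plain Picard iteration, with the weight W made orthogonal via the Cayley transform. The function names, tensor shapes, and the use of unaccelerated Picard iteration (the paper instead uses Anderson-accelerated operator splitting) are illustrative assumptions, not the reference implementation from the linked repository.

```python
import numpy as np

def cayley_orthogonal(A):
    """Orthogonal weight via the Cayley transform W = (I + S)^{-1}(I - S),
    where S = A - A^T is skew-symmetric; the two factors commute, and
    W^T W = I holds exactly."""
    S = A - A.T
    I = np.eye(A.shape[0])
    return np.linalg.solve(I + S, I - S)

def ignn_fixed_point(W, U, X, A_hat, phi=np.tanh, n_iter=100, tol=1e-8):
    """Picard iteration for the IGNN equilibrium Z = phi(W Z A_hat + U X).
    Assumed shapes: W (d, d), U (d, p), X (p, n), A_hat (n, n) a
    normalized adjacency matrix, Z (d, n) the node embeddings."""
    Z = np.zeros((W.shape[0], A_hat.shape[0]))
    for _ in range(n_iter):
        Z_new = phi(W @ Z @ A_hat + U @ X)
        if np.linalg.norm(Z_new - Z) < tol:
            return Z_new
        Z = Z_new
    return Z
```

Per the abstract, the orthogonal (Cayley) parameterization is what stabilizes learning of LRD, while the monotone parameterization widens the class of admissible weights without losing the well-posedness guarantee.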
Award ID(s):
1952339
PAR ID:
10529108
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Proceedings of Machine Learning Research
Date Published:
Volume:
202
Page Range / eLocation ID:
1521-1548
Format(s):
Medium: X
Location:
https://proceedings.mlr.press/v202/baker23a.html
Sponsoring Org:
National Science Foundation
More Like This
  1. In this paper we show global well-posedness near vacuum for the binary–ternary Boltzmann equation. The binary–ternary Boltzmann equation provides a correction term to the classical Boltzmann equation, taking into account both binary and ternary interactions of particles, and may serve as a more accurate description model for denser gases in non-equilibrium. Well-posedness of the classical Boltzmann equation and, independently, of the purely ternary Boltzmann equation follow as special cases. To prove global well-posedness, we use a Kaniel–Shinbrot iteration and related work to approximate the solution of the non-linear equation by monotone sequences of supersolutions and subsolutions. This analysis required establishing new convolution-type estimates to control the contribution of the ternary collisional operator to the model. We show that the ternary operator allows consideration of softer potentials than the binary operator, and consequently our solution to the ternary correction of the Boltzmann equation preserves all the properties of the binary-interaction solution. These results are novel for collisional operators of monoatomic gases with either hard or soft potentials that model both binary and ternary interactions.
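For context, the Kaniel–Shinbrot construction can be sketched generically as follows (the mild formulation along characteristics and the precise binary–ternary splitting are suppressed; this is a schematic of the method, not the paper's exact scheme). Writing the collision operator as gain minus loss, Q(f) = G(f) - f L(f), one solves a pair of linear problems at each step:

```latex
% Generic Kaniel--Shinbrot iteration: the linear problems below generate
% a nondecreasing sequence of subsolutions l_n and a nonincreasing
% sequence of supersolutions u_n that sandwich the mild solution.
\begin{aligned}
\partial_t l_{n+1} + v \cdot \nabla_x l_{n+1} + l_{n+1}\, L(u_n) &= G(l_n),\\
\partial_t u_{n+1} + v \cdot \nabla_x u_{n+1} + u_{n+1}\, L(l_n) &= G(u_n),
\end{aligned}
\qquad 0 \le l_0 \le l_1 \le u_1 \le u_0.
```

When the beginning condition on the right holds, both sequences converge monotonically to the unique solution trapped between them.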
  2. The response of a body described by a quasi-linear viscoelastic constitutive relation, whose material moduli depend on the mechanical pressure (that is, one-third of the trace of the stress), is studied. The constitutive relation stems from a class of implicit relations between the histories of the stress and the relative deformation gradient. A priori thresholding is enforced through the pressure, which ensures that the displacement gradient remains small. The resulting mixed variational problem consists of an evolutionary equation with a Volterra convolution operator; this equation is studied for well-posedness within the theory of maximal monotone graphs. For isotropic extension or compression, a semi-analytic solution of the quasi-linear viscoelastic problem is constructed under stress control. The equations are studied numerically with respect to monotone loading, both with and without thresholding. In the example, the thresholding procedure ensures that the solution does not blow up in finite time.
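As a rough schematic only (not the paper's exact constitutive relation), a quasi-linear viscoelastic law of this type couples a Volterra convolution with a nonlinear, pressure-dependent response:

```latex
% Schematic quasi-linear viscoelastic relation: K is a relaxation kernel,
% F a nonlinear response whose moduli depend on the mechanical pressure p,
% defined here, following the abstract, as one-third the trace of stress.
\varepsilon(t) = \int_0^t K(t - s)\,
  \frac{\mathrm{d}}{\mathrm{d}s}\, F\big(\sigma(s),\, p(s)\big)\, \mathrm{d}s,
\qquad p = \tfrac{1}{3} \operatorname{tr} \sigma.
```

The a priori thresholding described above acts through p, capping the constitutive response so that the displacement gradient stays in the small-strain regime.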
  3. Buttazzo, G.; Casas, E.; de Teresa, L.; Glowinski, R.; Leugering, G.; Trélat, E.; Zhang, X. (Ed.)
    An optimal control problem is considered for a stochastic differential equation with the cost functional determined by a backward stochastic Volterra integral equation (BSVIE, for short). This kind of cost functional can cover the general discounting (including exponential and non-exponential) situations with a recursive feature. It is known that such a problem is time-inconsistent in general. Therefore, instead of finding a global optimal control, we look for a time-consistent locally near optimal equilibrium strategy. With the idea of multi-person differential games, a family of approximate equilibrium strategies is constructed associated with partitions of the time intervals. By sending the mesh size of the time interval partition to zero, an equilibrium Hamilton–Jacobi–Bellman (HJB, for short) equation is derived, through which the equilibrium value function and an equilibrium strategy are obtained. Under certain conditions, a verification theorem is proved and the well-posedness of the equilibrium HJB equation is established. As a sort of Feynman–Kac formula for the equilibrium HJB equation, a new class of BSVIEs (containing the diagonal value Z(r, r) of Z(·, ·)) is naturally introduced and the well-posedness of this kind of equation is briefly presented.
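For orientation, a BSVIE in its commonly studied form reads as follows (this is the standard form from the BSVIE literature, shown for context; the paper's new class additionally involves the diagonal value Z(r, r) of the second unknown):

```latex
% A backward stochastic Volterra integral equation: psi (free term) and
% g (generator) are given; the pair (Y, Z) is the unknown.
Y(t) = \psi(t) + \int_t^T g\big(t, s, Y(s), Z(t, s)\big)\, \mathrm{d}s
       - \int_t^T Z(t, s)\, \mathrm{d}W(s), \qquad t \in [0, T].
```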
  4. Implicit neural networks are a general class of learning models that replace the layers in traditional feedforward models with implicit algebraic equations. Compared to traditional learning models, implicit networks offer competitive performance and reduced memory consumption. However, they can remain brittle with respect to adversarial perturbations of the input. This paper proposes a theoretical and computational framework for robustness verification of implicit neural networks; our framework blends together mixed monotone systems theory and contraction theory. First, given an implicit neural network, we introduce a related embedded network and show that, given an infinity-norm box constraint on the input, the embedded network provides an infinity-norm box over-approximation for the output of the original network. Second, using matrix measures induced by the infinity norm, we propose sufficient conditions for well-posedness of both the original and the embedded system and design an iterative algorithm to compute the infinity-norm box robustness margins for reachability and classification problems. Third, of independent value, we show that employing a suitable relative classifier variable in our analysis leads to tighter bounds on the certified adversarial robustness in classification problems. Finally, we perform numerical simulations on a Non-Euclidean Monotone Operator Network (NEMON) trained on the MNIST dataset. In these simulations, we compare the accuracy and run time of our mixed-monotone contractive approach with existing robustness verification approaches in the literature for estimating the certified adversarial robustness.
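The embedding step admits a compact illustration. The NumPy sketch below propagates an infinity-norm input box through an implicit layer x = φ(Wx + Uu + b) by splitting the weights into their nonnegative and nonpositive parts, in the spirit of the mixed-monotone embedding; the function name, the choice φ = tanh, and the plain fixed-point loop are illustrative assumptions rather than the paper's verification algorithm.

```python
import numpy as np

def embedded_box_bounds(W, U, b, u_lo, u_hi, phi=np.tanh, n_iter=200):
    """Infinity-norm box over-approximation of the output of an implicit
    layer x = phi(W x + U u + b), assuming phi is nondecreasing and the
    embedded iteration converges (e.g., under a contraction condition)."""
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)  # W = Wp + Wn
    Up, Un = np.maximum(U, 0.0), np.minimum(U, 0.0)
    n = W.shape[0]
    x_lo, x_hi = np.zeros(n), np.zeros(n)
    for _ in range(n_iter):
        # Lower bound pairs positive weight parts with lower iterates and
        # negative parts with upper iterates; the upper bound is symmetric.
        pre_lo = Wp @ x_lo + Wn @ x_hi + Up @ u_lo + Un @ u_hi + b
        pre_hi = Wp @ x_hi + Wn @ x_lo + Up @ u_hi + Un @ u_lo + b
        x_lo, x_hi = phi(pre_lo), phi(pre_hi)
    return x_lo, x_hi
```

Under the stated assumptions, the resulting box bounds the network output componentwise for every input in the given box, which is the kind of over-approximation used for reachability and classification margins.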
  5.
    The goal of this paper is to promote the use of fixed point strategies in data science by showing that they provide a simplifying and unifying framework to model, analyze, and solve a great variety of problems. They are seen to constitute a natural environment to explain the behavior of advanced convex optimization methods as well as of recent nonlinear methods in data science which are formulated in terms of paradigms that go beyond minimization concepts and involve constructs such as Nash equilibria or monotone inclusions. We review the pertinent tools of fixed point theory and describe the main state-of-the-art algorithms for provably convergent fixed point construction. We also incorporate additional ingredients such as stochasticity, block implementations, and non-Euclidean metrics, which provide further enhancements. Applications to signal and image processing, machine learning, statistics, neural networks, and inverse problems are discussed.
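As one concrete instance of the provably convergent fixed point constructions surveyed, here is a minimal Krasnosel'skii–Mann iteration in NumPy; the operator T below is a toy affine contraction (hence nonexpansive) chosen purely for illustration.

```python
import numpy as np

def krasnoselskii_mann(T, x0, lam=0.5, tol=1e-10, max_iter=10000):
    """Krasnosel'skii-Mann iteration x_{k+1} = (1 - lam) x_k + lam T(x_k);
    converges to a fixed point whenever T is nonexpansive and one exists."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_next = (1.0 - lam) * x + lam * T(x)
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Toy usage: T(x) = A x + c with spectral norm of A below 1, so the
# unique fixed point is (I - A)^{-1} c.
A = np.array([[0.5, -0.2], [0.1, 0.4]])
c = np.array([1.0, -0.5])
x_star = krasnoselskii_mann(lambda x: A @ x + c, np.zeros(2))
```

The averaging parameter lam trades per-step progress against the generality of the guarantee: for merely nonexpansive T, the averaged update is what secures convergence where pure Picard iteration may cycle.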