Title: A Variational Inequality Model for Learning Neural Networks
Neural networks have become ubiquitous tools for solving signal and image processing problems, and they often outperform standard approaches. Nevertheless, training the layers of a neural network is a challenging task in many applications. The prevalent training procedure consists of minimizing highly non-convex objectives based on data sets of huge dimension. In this context, current methodologies are not guaranteed to produce global solutions. We present an alternative approach which forgoes the optimization framework and adopts a variational inequality formalism. The associated algorithm guarantees convergence of the iterates to a true solution of the variational inequality, and it possesses an efficient block-iterative structure. A numerical application is presented.
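As a hedged illustration of the formalism (the paper's specific block-iterative algorithm is not reproduced here), a variational inequality asks for a point x* in a constraint set C such that ⟨F(x*), x − x*⟩ ≥ 0 for all x in C, and the classical projection method alternates a forward step on F with a projection onto C. All names and the toy problem below are our own assumptions:

```python
import numpy as np

def solve_vi(F, project, x0, step=0.01, n_iter=5000, tol=1e-8):
    """Projection method for the VI: find x* in C with
    <F(x*), x - x*> >= 0 for all x in C, where C is encoded by `project`."""
    x = x0
    for _ in range(n_iter):
        x_new = project(x - step * F(x))   # forward step on F, then project onto C
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Toy instance: F(x) = A^T (A x - b) with C the nonnegative orthant, so the
# VI solution is the nonnegative least-squares minimizer.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
x_star = solve_vi(lambda x: A.T @ (A @ x - b),
                  lambda x: np.maximum(x, 0.0),
                  np.zeros(5))
```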
Award ID(s): 2211123
PAR ID: 10414029
Author(s) / Creator(s): ; ;
Publisher / Repository: IEEE
Date Published:
Journal Name: IEEE International Conference on Acoustics, Speech, and Signal Processing
Volume: 2023
Page Range / eLocation ID: 1 to 5
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. The use of min-max optimization in the adversarial training of deep neural network classifiers and in the training of generative adversarial networks has motivated the study of nonconvex-nonconcave optimization objectives, which frequently arise in these applications. Unfortunately, recent results have established that computing even approximate first-order stationary points of such objectives is intractable, even under smoothness conditions, motivating the study of min-max objectives with additional structure. We introduce a new class of structured nonconvex-nonconcave min-max optimization problems and propose a generalization of the extragradient algorithm which provably converges to a stationary point. The algorithm applies not only to Euclidean spaces, but also to general ℓp-normed finite-dimensional real vector spaces. We also discuss its stability under stochastic oracles and provide bounds on its sample complexity. Our iteration complexity and sample complexity bounds either match or improve the best known bounds for the same or less general nonconvex-nonconcave settings, such as those that satisfy variational coherence or in which a weak solution to the associated variational inequality problem is assumed to exist.
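For reference, here is the classical Euclidean, deterministic extragradient step that this line of work generalizes; the ℓp-normed and stochastic-oracle variants described above are not reproduced, and the toy saddle problem is our own:

```python
import numpy as np

def extragradient(F, z0, step=0.1, n_iter=2000):
    """Korpelevich extragradient: an extrapolation (prediction) step on F,
    then an update evaluated at the predicted point."""
    z = z0
    for _ in range(n_iter):
        z_half = z - step * F(z)      # prediction step
        z = z - step * F(z_half)      # correction step at the predicted point
    return z

# Toy saddle problem f(x, y) = x * y with gradient field F(x, y) = (y, -x):
# plain gradient descent-ascent spirals away from the saddle point (0, 0),
# while extragradient converges to it.
F = lambda z: np.array([z[1], -z[0]])
print(extragradient(F, np.array([1.0, 1.0])))   # approaches [0, 0]
```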
  2. Motivated by structures that appear in deep neural networks, we investigate nonlinear composite models alternating proximity and affine operators defined on different spaces. We first show that a wide range of activation operators used in neural networks are actually proximity operators. We then establish conditions for the averagedness of the proposed composite constructs and investigate their asymptotic properties. It is shown that the limit of the resulting process solves a variational inequality which, in general, does not derive from a minimization problem. The analysis relies on tools from monotone operator theory and sheds some light on a class of neural network structures with so far elusive asymptotic properties.
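A small illustration of the stated observation that common activations are proximity operators, prox_g(v) = argmin_x g(x) + ½‖x − v‖²; the function names and the composite-layer sketch are ours, not the paper's notation:

```python
import numpy as np

# prox_g(v) = argmin_x  g(x) + 0.5 * ||x - v||^2

def relu(v):
    """ReLU is the proximity operator of the indicator function of the
    nonnegative orthant, i.e. the projection max(v, 0)."""
    return np.maximum(v, 0.0)

def soft_threshold(v, lam):
    """Soft-thresholding is the proximity operator of g(x) = lam * ||x||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def layer(x, W, b):
    """One step of a composite model: an affine map followed by a
    proximity operator (here ReLU)."""
    return relu(W @ x + b)
```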
  3. Within the context of event modeling and understanding, we propose a new method for neural sequence modeling that takes partially-observed sequences of discrete, external knowledge into account. We construct a sequential neural variational autoencoder, which uses Gumbel-Softmax reparametrization within a carefully defined encoder, to allow for successful backpropagation during training. The core idea is to allow semi-supervised external discrete knowledge to guide, but not restrict, the variational latent parameters during training. Our experiments indicate that our approach not only outperforms multiple baselines and the state-of-the-art in narrative script induction, but also converges more quickly.
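For reference, a minimal NumPy sketch of the Gumbel-Softmax reparametrization the abstract mentions; the paper applies it inside a trained encoder, whereas this standalone version only shows the sampling step:

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Relaxed (differentiable) sample from a categorical distribution:
    perturb the logits with Gumbel(0, 1) noise, then apply a tempered softmax."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(low=1e-20, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))            # Gumbel(0, 1) noise
    y = (logits + g) / tau             # lower tau -> closer to one-hot
    y = y - y.max()                    # numerical stabilization
    return np.exp(y) / np.exp(y).sum()

# As tau -> 0 the sample approaches a one-hot vector, so gradients can flow
# through an (approximately) discrete latent variable during training.
print(gumbel_softmax(np.log(np.array([0.7, 0.2, 0.1])), tau=0.5))
```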
  4. We consider various versions of the obstacle and thin-obstacle problems, interpret them as variational inequalities with a non-smooth constraint, and prove that they satisfy a new constrained Łojasiewicz inequality. The difficulty lies in the fact that, since the constraint is non-analytic, the pioneering method of L. Simon does not apply, and we have to exploit a better understanding of the constraint itself. We then apply this inequality to two associated problems. First, we combine it with an abstract result on parabolic variational inequalities to prove that the strong global solutions of the parabolic obstacle and thin-obstacle problems converge at infinity, with a rate, to a unique stationary solution. Second, we give an abstract proof, based on a parabolic approach, of the epiperimetric inequality, which we then apply to the singular points of the obstacle and thin-obstacle problems.
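The results above are analytic, but a toy discretization shows the variational-inequality structure of the obstacle problem (solution above an obstacle ψ, harmonic off the contact set). The discretization and solver choice below are our own illustrative assumptions, not the paper's method:

```python
import numpy as np

def obstacle_problem_1d(psi, n_sweeps=5000):
    """Projected Gauss-Seidel for a discretized 1-D obstacle problem:
    -u'' >= 0, u >= psi, (-u'') (u - psi) = 0, with u = 0 on the boundary.
    Each sweep relaxes the discrete Laplace equation at interior nodes,
    then projects onto the constraint u >= psi."""
    u = np.zeros_like(psi)
    u[1:-1] = np.maximum(psi[1:-1], 0.0)
    for _ in range(n_sweeps):
        for i in range(1, len(u) - 1):
            u[i] = max(0.5 * (u[i - 1] + u[i + 1]), psi[i])  # relax, then project
    return u

# Obstacle: a parabolic bump; the solution touches it on a contact set and
# is linear (1-D harmonic) elsewhere.
x = np.linspace(0.0, 1.0, 101)
u = obstacle_problem_1d(0.2 - 4.0 * (x - 0.5) ** 2)
```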