skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Event representation with sequential, semi-supervised discrete variables
Within the context of event modeling and understanding, we propose a new method for neural sequence modeling that takes partially-observed sequences of discrete, external knowledge into account. We construct a sequential neural variational autoencoder, which uses Gumbel-Softmax reparametrization within a carefully defined encoder, to allow for successful backpropagation during training. The core idea is to allow semi-supervised external discrete knowledge to guide, but not restrict, the variational latent parameters during training. Our experiments indicate that our approach not only outperforms multiple baselines and the state-of-the-art in narrative script induction, but also converges more quickly.  more » « less
Award ID(s):
1940931
PAR ID:
10324924
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Annual Conference of the North American Chapter of the Association for Computational Linguistics
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Within the context of event modeling and understanding, we propose a new method for neural sequence modeling that takes partially-observed sequences of discrete, external knowledge into account. We construct a sequential neural variational autoencoder, which uses Gumbel-Softmax reparametrization within a carefully defined encoder, to allow for successful backpropagation during training. The core idea is to allow semi-supervised external discrete knowledge to guide, but not restrict, the variational latent parameters during training. Our experiments indicate that our approach not only outperforms multiple baselines and the state-of-the-art in narrative script induction, but also converges more quickly. 
    more » « less
  2. null (Ed.)
    We show how to learn a neural topic model with discrete random variables—one that explicitly models each word’s assigned topic—using neural variational inference that does not rely on stochastic backpropagation to handle the discrete variables. The model we utilize combines the expressive power of neural methods for representing sequences of text with the topic model’s ability to capture global, thematic coherence. Using neural variational inference, we show improved perplexity and document understanding across multiple corpora. We examine the effect of prior parameters both on the model and variational parameters, and demonstrate how our approach can compete and surpass a popular topic model implementation on an automatic measure of topic quality. 
    more » « less
  3. Bouamor, Houda; Pino, Juan; Bali, Kalia (Ed.)
    This paper presents FlowSUM, a normalizing flows-based variational encoder-decoder framework for Transformer-based summarization. Our approach tackles two primary challenges in variational summarization: insufficient semantic information in latent representations and posterior collapse during training. To address these challenges, we employ normalizing flows to enable flexible latent posterior modeling, and we propose a controlled alternate aggressive training (CAAT) strategy with an improved gate mechanism. Experimental results show that FlowSUM significantly enhances the quality of generated summaries and unleashes the potential for knowledge distillation with minimal impact on inference time. Furthermore, we investigate the issue of posterior collapse in normalizing flows and analyze how the summary quality is affected by the training strategy, gate initialization, and the type and number of normalizing flows used, offering valuable insights for future research. 
    more » « less
  4. A key aspect of the neural coding problem is understanding how representations of afferent stimuli are built through the dynamics of learning and adaptation within neural networks. The infomax paradigm is built on the premise that such learning attempts to maximize the mutual information between input stimuli and neural activities. In this letter, we tackle the problem of such information-based neural coding with an eye toward two conceptual hurdles. Specifically, we examine and then show how this form of coding can be achieved with online input processing. Our framework thus obviates the biological incompatibility of optimization methods that rely on global network awareness and batch processing of sensory signals. Central to our result is the use of variational bounds as a surrogate objective function, an established technique that has not previously been shown to yield online policies. We obtain learning dynamics for both linear-continuous and discrete spiking neural encoding models under the umbrella of linear gaussian decoders. This result is enabled by approximating certain information quantities in terms of neuronal activity via pairwise feedback mechanisms. Furthermore, we tackle the problem of how such learning dynamics can be realized with strict energetic constraints. We show that endowing networks with auxiliary variables that evolve on a slower timescale can allow for the realization of saddle-point optimization within the neural dynamics, leading to neural codes with favorable properties in terms of both information and energy. 
    more » « less
  5. Meila, Marina; Zhang, Tong (Ed.)
    Black-box variational inference algorithms use stochastic sampling to analyze diverse statistical models, like those expressed in probabilistic programming languages, without model-specific derivations. While the popular score-function estimator computes unbiased gradient estimates, its variance is often unacceptably large, especially in models with discrete latent variables. We propose a stochastic natural gradient estimator that is as broadly applicable and unbiased, but improves efficiency by exploiting the curvature of the variational bound, and provably reduces variance by marginalizing discrete latent variables. Our marginalized stochastic natural gradients have intriguing connections to classic coordinate ascent variational inference, but allow parallel updates of variational parameters, and provide superior convergence guarantees relative to naive Monte Carlo approximations. We integrate our method with the probabilistic programming language Pyro and evaluate real-world models of documents, images, networks, and crowd-sourcing. Compared to score-function estimators, we require far fewer Monte Carlo samples and consistently convergence orders of magnitude faster. 
    more » « less