Note: Clicking a Digital Object Identifier (DOI) link will take you to an external site maintained by the publisher. Some full-text articles may not yet be available without charge during the embargo (administrative interval).

Multivariate time series anomaly detection has become an active area of research in recent years, with Deep Learning models outperforming previous approaches on benchmark datasets. Among reconstruction-based models, most previous work has focused on Variational Autoencoders and Generative Adversarial Networks. This work presents DGHL, a new family of generative models for time series anomaly detection, trained by maximizing the observed likelihood by posterior sampling and alternating back-propagation. A top-down Convolution Network maps a novel hierarchical latent space to time series windows, exploiting temporal dynamics to encode information efficiently. Despite relying on posterior sampling, it is computationally more efficient than current…
Free, publicly-accessible full text available January 1, 2023.
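The training scheme named above can be illustrated with a minimal sketch. This is a hypothetical toy, not the DGHL architecture: the generator is a plain linear map x ≈ W z, the inference step follows the posterior gradient on the latents z (the Langevin noise term of true posterior sampling is omitted, reducing it to MAP inference for clarity), and the learning step then updates the generator parameters.

```python
import numpy as np

# Toy alternating back-propagation for a linear generator x ≈ W z
# (hypothetical sketch; not the DGHL model).

rng = np.random.default_rng(0)
d_x, d_z, n = 8, 2, 64
W_true = rng.normal(size=(d_x, d_z))
X = rng.normal(size=(n, d_z)) @ W_true.T        # toy "observed windows"

W = rng.normal(size=(d_x, d_z))                 # generator parameters
Z = np.zeros((n, d_z))                          # per-window latents

def recon_error(W, Z):
    return float(np.mean((X - Z @ W.T) ** 2))

err0 = recon_error(W, Z)
for _ in range(500):
    # Inference step: gradient of log p(x | z) + log p(z) w.r.t. z
    Z += 0.05 * ((X - Z @ W.T) @ W - Z)
    # Learning step: gradient of log p(x | z) w.r.t. W
    W += 0.1 * (X - Z @ W.T).T @ Z / n
```

The two interleaved gradient steps are the essence of alternating back-propagation: latents are inferred under the current generator, then the generator is updated under the current latents.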

The problem of continuous inverse optimal control (over a finite time horizon) is to learn the unknown cost function over the sequence of continuous control variables from expert demonstrations. In this article, we study this fundamental problem in the framework of energy-based models, where the observed expert trajectories are assumed to be random samples from a probability density function defined as the exponential of the negative cost function up to a normalizing constant. The parameters of the cost function are learned by maximum likelihood via an “analysis by synthesis” scheme, which iterates (1) synthesis step: sample the synthesized trajectories from the…
Free, publicly-accessible full text available January 1, 2023.
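The density form and learning scheme described above can be checked numerically on a deliberately simple case. This is a hypothetical 1-D stand-in: the cost is c_theta(x) = theta * x**2, so p_theta is a Gaussian and the synthesis step can sample it exactly instead of running MCMC; the maximum-likelihood gradient on theta is the difference of expected cost gradients under the model and the data.

```python
import numpy as np

# Toy "analysis by synthesis" loop for p_theta(x) ∝ exp(-theta * x**2)
# (hypothetical 1-D cost; exact sampling stands in for the MCMC synthesis).

rng = np.random.default_rng(0)
theta_true = 1.0
# Expert demonstrations: samples from the true model, a Gaussian
# with variance 1 / (2 * theta_true).
experts = rng.normal(scale=np.sqrt(1 / (2 * theta_true)), size=5_000)

theta, lr = 3.0, 0.5
for _ in range(200):
    # Synthesis step: draw samples from the current model p_theta
    synth = rng.normal(scale=np.sqrt(1 / (2 * theta)), size=5_000)
    # Analysis step: MLE gradient = E_model[x^2] - E_data[x^2]
    theta += lr * (np.mean(synth**2) - np.mean(experts**2))
```

At convergence the model's expected cost matches the experts', recovering a theta close to the true value.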

Outside-knowledge visual question answering (OK-VQA) requires the agent to comprehend the image, make use of relevant knowledge from the entire web, and digest all the information to answer the question. Most previous works address the problem by first fusing the image and question in the multimodal space, which is inflexible for further fusion with a vast amount of external knowledge. In this paper, we call for an alternative paradigm for the OK-VQA task, which transforms the image into plain text, so that we can enable knowledge passage retrieval and generative question answering in the natural language space. This paradigm takes advantage…
Free, publicly-accessible full text available January 1, 2023.
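The image-to-text paradigm can be sketched as a tiny pipeline. Everything here is hypothetical: the caption is assumed to come from an upstream captioning model, the knowledge passages are made up, and retrieval is naive token overlap rather than a learned retriever.

```python
# Hypothetical OK-VQA pipeline in the natural language space:
# image -> caption (assumed given), then caption + question retrieve
# a knowledge passage, from which an answer would be generated.

caption = "a red double decker bus on a london street"
question = "in which country would you see this bus"

passages = [
    "the eiffel tower is a landmark in paris france",
    "double decker buses are a famous sight in london england",
    "pandas are native to the mountain ranges of china",
]

def overlap_score(query, passage):
    # Naive retrieval score: number of shared tokens
    return len(set(query.split()) & set(passage.split()))

query = caption + " " + question
best = max(range(len(passages)), key=lambda i: overlap_score(query, passages[i]))
```

Because both image content and knowledge live in plain text, the same retrieval machinery can scale to arbitrarily large external corpora, which is the flexibility the paragraph argues for.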

This paper proposes a representational model for image pairs such as consecutive video frames that are related by local pixel displacements, in the hope that the model may shed light on motion perception in primary visual cortex (V1). The model couples the following two components: (1) the vector representations of local contents of images and (2) the matrix representations of local pixel displacements caused by the relative motions between the agent and the objects in the 3D scene. When the image frame undergoes changes due to local pixel displacements, the vectors are multiplied by the matrices that represent the local…
Free, publicly-accessible full text available January 1, 2023.
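The vector-matrix coupling can be made concrete with a toy parametrization (hypothetical, not the paper's learned representation): let the local content vector be 2-D and let a displacement dx act on it as a rotation matrix M(dx). Rotations make displacements compose additively, M(dx1) M(dx2) = M(dx1 + dx2), mirroring how motions compose in the scene.

```python
import numpy as np

# Toy instance of "vectors for content, matrices for displacement":
# a displacement dx acts on the content vector v by multiplication
# with a rotation matrix M(dx) (hypothetical parametrization).

def M(dx, omega=0.5):
    a = omega * dx
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

v = np.array([1.0, 0.0])
v_moved = M(2.0) @ M(1.0) @ v      # two successive displacements
v_direct = M(3.0) @ v              # one combined displacement
```

The two paths agree, and the rotation preserves the vector's norm, so the "content" is transported rather than distorted as the frame shifts.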

Is intelligence realized by connectionism or classicism? While connectionist approaches have achieved superhuman performance, there has been growing evidence that such task-specific superiority is particularly fragile in systematic generalization. This observation lies at the center of the debate between connectionism and classicism, wherein the latter continually advocates an algebraic treatment in cognitive architectures. In this work, we follow the classicist’s call and propose a hybrid approach to improve systematic generalization in reasoning. Specifically, we showcase a prototype with algebraic representation for the abstract spatial-temporal reasoning task of Raven’s Progressive Matrices (RPM) and present the ALgebra-Aware Neuro-Semi-Symbolic (ALANS) learner. The ALANS learner is…
Free, publicly-accessible full text available January 1, 2023.

The core of self-supervised learning for pre-training language models includes pre-training task design as well as appropriate data augmentation. Most data augmentations in language model pre-training are context-independent. A seminal contextualized augmentation was recently proposed in ELECTRA, which achieved state-of-the-art performance by introducing an auxiliary generation network (generator) to produce contextualized data augmentation for the training of a main discrimination network (discriminator). This design, however, introduces extra computation cost of the generator and a need to adjust the relative capability between the generator and the discriminator. In this paper, we propose a self-augmentation strategy (SAS) where a single network…
Free, publicly-accessible full text available January 1, 2023.
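The generator/discriminator setup described above (the ELECTRA-style replaced-token detection objective) can be sketched with a trivial stand-in generator. Everything here is hypothetical: the "generator" is a uniform random sampler rather than a learned network, and the vocabulary and sentence are made up.

```python
import random

# Toy contextualized augmentation via replaced-token detection:
# some positions are corrupted by a generator's proposal, and the
# discriminator's target is whether each token was replaced.

random.seed(0)
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
tokens = ["the", "cat", "sat", "on", "the", "mat"]

mask_prob = 0.3
corrupted, labels = [], []
for t in tokens:
    if random.random() < mask_prob:
        g = random.choice(vocab)       # generator proposes a token
        corrupted.append(g)
        labels.append(int(g != t))     # 1 iff the token was actually replaced
    else:
        corrupted.append(t)
        labels.append(0)
```

Note the subtlety the labels encode: if the generator happens to propose the original token, the position counts as not replaced, which is one reason the generator's and discriminator's capabilities must be balanced in the ELECTRA design.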

Learning an energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm. However, MCMC sampling of EBMs in high-dimensional data space is generally not mixing, because the energy function, which is usually parametrized by a deep network, is highly multimodal in the data space. This is a serious handicap for both the theory and practice of EBMs. In this paper, we propose to learn an EBM with a flow-based model (or, in general, a latent variable model) serving as a backbone, so that the EBM is a correction or an exponential tilting of the flow-based model. We…
Free, publicly-accessible full text available January 1, 2023.
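The "exponential tilting" form, p(x) = q(x) exp(f(x)) / Z, can be verified numerically on a case with a closed-form answer. This is a hypothetical stand-in: the base q is N(0, 1) in place of a flow, and the correction is f(x) = a * x, for which Z = E_q[exp(aX)] = exp(a**2 / 2) exactly.

```python
import numpy as np

# Monte Carlo check of the exponential-tilting normalizer:
# p(x) = q(x) * exp(f(x)) / Z with q = N(0, 1) and f(x) = a * x
# (toy stand-ins for the flow backbone and the EBM correction).

rng = np.random.default_rng(0)
a = 0.7
x = rng.normal(size=200_000)       # samples from the base model q
Z_mc = np.exp(a * x).mean()        # Z = E_q[exp(f(X))]
Z_true = np.exp(a**2 / 2)          # closed form for this Gaussian case
```

The estimate agrees with the closed form, illustrating why a tractable backbone helps: expectations under q (and hence the correction's effect) can be estimated by plain sampling from the backbone.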

A prerequisite for social coordination is bidirectional communication between teammates, each playing two roles simultaneously: as receptive listeners and expressive speakers. For robots working with humans in complex situations with multiple goals that differ in importance, failure to fulfill the expectation of either role could undermine group performance due to misalignment of values between humans and robots. Specifically, a robot needs to serve as an effective listener to infer human users’ intents from instructions and feedback and as an expressive speaker to explain its decision processes to users. Here, we investigate how to foster effective bidirectional human-robot communications in the…
Free, publicly-accessible full text available January 1, 2023.

Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interest in generative modeling. Fueled by their flexibility in formulation and the strong modeling power of the latent space, recent works built upon them have made interesting attempts at the interpretability of text modeling. However, latent space EBMs also inherit some flaws from EBMs in data space; the degenerate MCMC sampling quality in practice can lead to poor generation quality and instability in training, especially on data with complex latent structures. Inspired by recent efforts that leverage diffusion recovery likelihood learning as a cure for…
Free, publicly-accessible full text available January 1, 2023.

We propose to learn an energy-based model (EBM) in the latent space of a generator model, so that the EBM serves as a prior model that stands on the top-down network of the generator model. Both the latent space EBM and the top-down network can be learned jointly by maximum likelihood, which involves short-run MCMC sampling from both the prior and posterior distributions of the latent vector. Due to the low dimensionality of the latent space and the expressiveness of the top-down network, a simple EBM in latent space can capture regularities in the data effectively, and MCMC sampling in latent space is efficient and mixes well. We show that the learned…
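The short-run MCMC sampling from the latent prior can be sketched in one dimension, where the sampler can be checked against an exact answer. This is a hypothetical toy, not the paper's model: the prior is an exponential tilting of N(0, 1), p(z) ∝ exp(-E(z)) N(z; 0, 1), with a quadratic energy E(z) = c/2 * z**2, so the exact tilted prior is N(0, 1 / (1 + c)).

```python
import numpy as np

# Short-run Langevin chains targeting a latent-space EBM prior
# (toy 1-D example with a quadratic energy, so the target is Gaussian).

rng = np.random.default_rng(0)
c = 1.0                                  # toy energy curvature (hypothetical)
n_chains, n_steps, s = 20_000, 400, 0.1

def grad_U(z):
    # U(z) = E(z) + z**2 / 2 is the negative log unnormalized prior density
    return (1.0 + c) * z

z = rng.normal(size=n_chains)            # chains start from N(0, 1)
for _ in range(n_steps):
    z = z - 0.5 * s**2 * grad_U(z) + s * rng.normal(size=n_chains)

target_var = 1.0 / (1.0 + c)             # exact variance of the tilted prior
```

Because the latent space here is low-dimensional and the energy is simple, the short chains already match the target distribution closely, which is the mixing advantage the paragraph attributes to latent-space EBMs.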