Title: Pivotal Auto-Encoder via Self-Normalizing ReLU
Sparse auto-encoders are useful for extracting low-dimensional representations from high-dimensional data. However, their performance degrades sharply when the input noise at test time differs from the noise employed during training. This limitation hinders the applicability of auto-encoders in real-world scenarios where the level of noise in the input is unpredictable. In this paper, we formalize single hidden layer sparse auto-encoders as a transform learning problem. Leveraging the transform modeling interpretation, we propose an optimization problem that leads to a predictive model invariant to the noise level at test time. In other words, the same pre-trained model is able to generalize to different noise levels. The proposed optimization algorithm, derived from the square root lasso, is translated into a new, computationally efficient auto-encoding architecture. After proving that our new method is invariant to the noise level, we evaluate our approach by training networks using the proposed architecture for denoising tasks. Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise compared to commonly used architectures.
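At the core of the approach is the square root lasso: minimize ||y - Dx||_2 + lam * ||x||_1, whose un-squared data-fit term makes the optimal lam independent of the noise level. Below is a minimal proximal-gradient sketch of that estimator; the random dictionary, step size, and lam value are illustrative assumptions, not the paper's self-normalizing ReLU architecture.

```python
import numpy as np

def sqrt_lasso(D, y, lam=0.5, step=0.1, n_iter=500):
    """Proximal-gradient sketch of the square root lasso:
    minimize ||y - D x||_2 + lam * ||x||_1."""
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        r = y - D @ x
        norm_r = np.linalg.norm(r)
        if norm_r < 1e-12:                  # exact fit: gradient undefined
            break
        g = -D.T @ r / norm_r               # gradient of ||y - D x||_2
        z = x - step * g
        # soft thresholding: proximal operator of step * lam * ||.||_1
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return x

# Toy check: the SAME lam is reused at two different noise levels.
rng = np.random.default_rng(0)
D = rng.standard_normal((30, 50))
D /= np.linalg.norm(D, axis=0)              # unit-norm dictionary atoms
x_true = np.zeros(50)
x_true[[3, 17]] = [2.0, -1.5]               # sparse ground truth
for sigma in (0.01, 0.1):
    y = D @ x_true + sigma * rng.standard_normal(30)
    x_hat = sqrt_lasso(D, y)                # no noise-dependent retuning
```

Because the data-fit term is the residual norm rather than its square, rescaling the noise rescales both terms of the objective together, which is the "pivotal" property the abstract exploits.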
Award ID(s):
2007649
PAR ID:
10548036
Author(s) / Creator(s):
; ;
Publisher / Repository:
IEEE Transactions on Signal Processing
Date Published:
Journal Name:
IEEE Transactions on Signal Processing
Volume:
72
ISSN:
1053-587X
Page Range / eLocation ID:
3201 to 3212
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. The goal of speech separation is to extract multiple speech sources from a single microphone recording. Recently, with the advancement of deep learning and the availability of large datasets, speech separation has been formulated as a supervised learning problem. These approaches aim to learn discriminative patterns of speech, speakers, and background noise using a supervised learning algorithm, typically a deep neural network. A long-standing problem in supervised speech separation is finding the correct label for each separated speech signal, referred to as label permutation ambiguity: the problem of determining the output-label assignment between the separated sources and the available single-speaker speech labels. Finding the best output-label assignment is required for calculating the separation error, which is then used to update the parameters of the model. Recently, Permutation Invariant Training (PIT) has been shown to be a promising solution to the label ambiguity problem. However, PIT's overconfident choice of the output-label assignment results in a sub-optimally trained model. In this work, we propose a probabilistic optimization framework to address this inefficiency of PIT in finding the best output-label assignment. Our proposed method, entitled trainable Softminimum PIT, is employed on the same Long Short-Term Memory (LSTM) architecture used in the PIT speech separation method. The results of our experiments show that the proposed method outperforms conventional PIT speech separation significantly (p-value < 0.01), by +1 dB in Signal-to-Distortion Ratio (SDR) and +1.5 dB in Signal-to-Interference Ratio (SIR).
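The hard assignment at the heart of PIT fits in a few lines. Below is a minimal NumPy sketch of utterance-level PIT, plus a `soft_min` stand-in for the soft-minimum idea; the `beta` temperature and the weighting scheme are illustrative assumptions, and the paper's actual Softminimum PIT is defined over an LSTM separator, which is not shown here.

```python
import itertools
import numpy as np

def pit_loss(est, ref, loss_fn=lambda a, b: np.mean((a - b) ** 2)):
    """Utterance-level PIT: score every output-label permutation with the
    pairwise loss and keep the smallest total (the hard minimum)."""
    best, best_perm = np.inf, None
    for perm in itertools.permutations(range(est.shape[0])):
        total = sum(loss_fn(est[i], ref[p]) for i, p in enumerate(perm))
        if total < best:
            best, best_perm = total, perm
    return best, best_perm

def soft_min(values, beta=10.0):
    """Soft-minimum stand-in: softmax(-beta * loss) weighting of the
    per-permutation losses; beta plays the role of a trainable temperature."""
    v = np.asarray(values, dtype=float)
    w = np.exp(-beta * (v - v.min()))
    w /= w.sum()
    return float(w @ v)

# Two estimated sources returned in swapped order relative to the labels
est = np.array([[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]])
ref = est[::-1].copy()
loss, perm = pit_loss(est, ref)   # perm == (1, 0): the swap is recovered
```

Replacing the hard `min` with a soft, differentiable weighting is what lets assignment uncertainty propagate into the gradient instead of committing to one permutation per utterance.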
  2. This study introduces a hybrid model that uses a model-based optimization method to generate training data and an artificial neural network (ANN)-based learning method to provide real-time exoskeleton support during lifting. In the model-based optimization method, the knee-exoskeleton torque and the optimal lifting motion are predicted using a two-dimensional (2D) human–exoskeleton model. The control points of the exoskeleton motor current profiles and the human joint angle profiles, obtained from cubic B-spline interpolation, serve as the design variables, and the square of the normalized human joint torque is minimized as the cost function. The lifting optimization problem is then solved with a sequential quadratic programming (SQP) algorithm in the sparse nonlinear optimizer (SNOPT). In the learning-based approach, the control model is trained using a general regression neural network (GRNN). The anthropometric parameters of the human subjects and the lifting boundary postures are the input parameters, while the control points for the exoskeleton torque are the outputs. Once trained, the learning-based control model can provide exoskeleton assistive torque in real time for lifting tasks. Joint angle and ground reaction force (GRF) comparisons between experimental and simulation results are presented for two test subjects. Furthermore, for both subjects, using the exoskeleton significantly reduces the activations of the four knee extensor and flexor muscles compared to lifting without it. Overall, the learning-based control method can generate assistive torque profiles in real time, faster than the model-based optimal control approach.
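A GRNN is essentially kernel-weighted regression over the stored training set, which is why prediction is fast enough for real-time control. The sketch below uses synthetic placeholder data; in the study the inputs would be anthropometric parameters and lifting boundary postures and the outputs the exoskeleton-torque control points, and the bandwidth `sigma` is an illustrative choice.

```python
import numpy as np

def grnn_predict(X_train, Y_train, x_query, sigma=0.5):
    """GRNN (Specht-style) prediction: a Gaussian-kernel-weighted average
    of the training outputs -- no iterative training beyond storing data."""
    d2 = np.sum((X_train - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum()
    return w @ Y_train

# Placeholder data: 1-D "anthropometric" inputs, 2-D "control point" outputs
X_train = np.array([[0.0], [1.0]])
Y_train = np.array([[0.0, 0.0], [10.0, 10.0]])
pred = grnn_predict(X_train, Y_train, np.array([0.5]), sigma=0.1)  # -> [5, 5]
```

A query halfway between the two exemplars gets equal kernel weights, so the prediction is the average of their outputs; as the query approaches one exemplar, its output dominates.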
  3. Energy-based models (EBMs) assign an unnormalized log probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, and Bayesian reasoning. However, training EBMs by standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentially alleviates this problem; in particular, denoising score matching has been used successfully to train EBMs. Using noisy data samples with one fixed noise level, these models learn fast and yield good results in data denoising, but demonstrations of high-quality sample synthesis for high-dimensional data were lacking. Recently, a paper showed that a generative model trained by denoising score matching accomplishes excellent sample synthesis when trained with data samples corrupted with multiple levels of noise. Here we provide an analysis and empirical evidence showing that training with multiple noise levels is necessary when the data dimension is high. Leveraging this insight, we propose a novel EBM trained with multiscale denoising score matching. Our model exhibits data-generation performance comparable to state-of-the-art techniques such as GANs and sets a new baseline for EBMs. The proposed model also provides density information and performs well on an image-inpainting task.
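The multiscale denoising-score-matching objective can be sketched directly: perturb the data at several noise levels and regress the model's score toward the score of the Gaussian perturbation kernel, -noise / sigma^2. The sigma^2 weighting below is a common convention, assumed here rather than taken from the paper.

```python
import numpy as np

def dsm_loss(score_fn, x, sigmas, rng):
    """Multiscale denoising score matching (sketch).
    For each noise level sigma, the regression target for the perturbed
    sample x + n is the score of the Gaussian kernel: -n / sigma**2."""
    total = 0.0
    for sigma in sigmas:
        noise = sigma * rng.standard_normal(x.shape)
        target = -noise / sigma ** 2
        # sigma**2 weighting keeps all noise scales on a comparable footing
        total += sigma ** 2 * np.mean((score_fn(x + noise, sigma) - target) ** 2)
    return total / len(sigmas)

# Sanity check with x ~ N(0, I): the perturbed marginal is N(0, (1 + sigma^2) I),
# whose score is -x / (1 + sigma^2), so that model should beat a zero score.
rng = np.random.default_rng(1)
x = rng.standard_normal((2048, 4))
good = dsm_loss(lambda z, s: -z / (1.0 + s ** 2), x, [0.1, 1.0], rng)
bad = dsm_loss(lambda z, s: np.zeros_like(z), x, [0.1, 1.0], rng)
```

Note that even the optimal score does not drive the loss to zero; the irreducible term comes from the variance of the noise itself, which is why relative comparisons like `good < bad` are the meaningful check.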
  4. Out-of-distribution (OOD) generalization on graphs deals with scenarios where the test graph distribution differs from the training graph distributions. Compared to i.i.d. data such as images, OOD generalization on graph-structured data remains challenging because of the non-i.i.d. property and the complex structural information of graphs. Recently, several works on graph OOD generalization have explored extracting invariant subgraphs that share crucial classification information across different distributions. Nevertheless, such a strategy can be suboptimal for capturing the invariant information in full, since extracting discrete structures may either lose invariant information or admit spurious information. In this paper, we propose an innovative framework, named Generative Risk Minimization (GRM), designed to generate, rather than extract, an invariant subgraph for each input graph to be classified. To address the challenge of optimizing in the absence of ground-truth optimal invariant subgraphs, we derive a tractable form of the proposed GRM objective by introducing a latent causal variable, and its effectiveness is validated by our theoretical analysis. We further conduct extensive experiments across a variety of real-world graph datasets for both node-level and graph-level OOD generalization, and the results demonstrate the superiority of our framework GRM.
  5. High-dimensional data is commonly encountered in applications including genomics and image and video processing. Analyzing, computing with, and visualizing such data pose significant challenges, and feature extraction methods address them by producing compressed representations suitable for analysis and downstream tasks. One effective technique along these lines is sparse coding, which represents data as a sparse linear combination of a set of exemplars. In this study, we propose a local sparse coding framework for classification: the objective is to predict the label of a given data point from labeled training data. The primary optimization problem encourages each data point to be represented using nearby exemplars. We leverage the optimized sparse representation coefficients to predict the label of a test data point by assessing its similarity to the sparse representations of the training data. The proposed framework is computationally efficient and yields interpretable sparse representations. To illustrate its practicality, we apply it to the classification of crop diseases in agriculture.
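A minimal sketch of a local sparse coding classifier in the spirit described above: code the query against its k nearest exemplars with a small lasso solve (ISTA), then vote for the label carrying the most coefficient mass. The choices of k, lam, and the ISTA solver are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def local_sparse_classify(X, y, x_test, k=5, lam=0.1, n_iter=200):
    """Code x_test sparsely over its k nearest exemplars, then predict the
    label whose exemplars carry the most absolute coefficient mass."""
    d2 = np.sum((X - x_test) ** 2, axis=1)
    nn = np.argsort(d2)[:k]                  # locality: only nearby exemplars
    D = X[nn].T                              # columns = selected exemplars
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2, 1e-12)  # 1/L for ISTA
    c = np.zeros(k)
    for _ in range(n_iter):                  # ISTA on 0.5||Dc - x||^2 + lam||c||_1
        g = D.T @ (D @ c - x_test)
        z = c - step * g
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    labels = y[nn]
    return max(np.unique(labels), key=lambda lab: np.abs(c[labels == lab]).sum())
```

Restricting the dictionary to the k nearest exemplars keeps each solve tiny (a k-column lasso), which is where the computational efficiency and the interpretability of the coefficients come from.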