Title: Learning Overcomplete, Low Coherence Dictionaries with Linear Inference
Finding overcomplete latent representations of data has applications in data analysis, signal processing, machine learning, theoretical neuroscience and many other fields. In an overcomplete representation, the number of latent features exceeds the data dimensionality, which is useful when the data is undersampled by the measurements (compressed sensing or information bottlenecks in neural systems) or composed from multiple complete sets of linear features, each spanning the data space. Independent Components Analysis (ICA) is a linear technique for learning sparse latent representations, which typically has a lower computational cost than sparse coding, a linear generative model which requires an iterative, nonlinear inference step. While ICA is well suited for finding complete representations, we show that overcompleteness poses a challenge to existing ICA algorithms. Specifically, the coherence control used in existing ICA and other dictionary learning algorithms, necessary to prevent the formation of duplicate dictionary features, is ill-suited to the overcomplete case. We show that in the overcomplete case, several existing ICA algorithms have undesirable global minima that maximize coherence. We provide a theoretical explanation of these failures and, based on the theory, propose improved coherence control costs for overcomplete ICA algorithms. Further, by comparing ICA algorithms to the computationally more expensive sparse coding on synthetic data, we show that the limited applicability of overcomplete, linear inference can be extended with the proposed cost functions. Finally, when trained on natural images, we show that the coherence control biases the exploration of the data manifold, sometimes yielding suboptimal, coherent solutions. All told, this study contributes new insights into and methods for coherence control for linear ICA, some of which are applicable to many other nonlinear models.
Journal Name:
Journal of machine learning research
Sponsoring Org:
National Science Foundation
More Like this
  1. The optic nerve transmits visual information to the brain as trains of discrete events, a low-power, low-bandwidth communication channel also exploited by silicon retina cameras. Extracting high-fidelity visual input from retinal event trains is thus a key challenge for both computational neuroscience and neuromorphic engineering. Here, we investigate whether sparse coding can enable the reconstruction of high-fidelity images and video from retinal event trains. Our approach is analogous to compressive sensing, in which only a random subset of pixels is transmitted and the missing information is estimated via inference. We employed a variant of the Locally Competitive Algorithm to infer sparse representations from retinal event trains, using a dictionary of convolutional features optimized via stochastic gradient descent and trained in an unsupervised manner using a local Hebbian learning rule with momentum. We used an anatomically realistic retinal model with stochastic graded release from cones and bipolar cells to encode thumbnail images as spike trains arising from ON and OFF retinal ganglion cells. The spikes from each model ganglion cell were summed over a 32 msec time window, yielding a noisy rate-coded image. Analogous to how the primary visual cortex is postulated to infer features from noisy spike trains arising from the optic nerve, we inferred a higher-fidelity sparse reconstruction from the noisy rate-coded image using a convolutional dictionary trained on the original CIFAR10 database. To investigate whether a similar approach works on non-stochastic data, we demonstrate that the same procedure can be used to reconstruct high-frequency video from the asynchronous events arising from a silicon retina camera moving through a laboratory environment.
  2. We describe a stochastic, dynamical system capable of inference and learning in a probabilistic latent variable model. The most challenging problem in such models, sampling the posterior distribution over latent variables, is proposed to be solved by harnessing natural sources of stochasticity inherent in electronic and neural systems. We demonstrate this idea for a sparse coding model by deriving a continuous-time equation for inferring its latent variables via Langevin dynamics. The model parameters are learned by simultaneously evolving according to another continuous-time equation, thus bypassing the need for digital accumulators or a global clock. Moreover, we show that Langevin dynamics lead to an efficient procedure for sampling from the posterior distribution in the L0 sparse regime, where latent variables are encouraged to be set to zero as opposed to having a small L1 norm. This allows the model to properly incorporate the notion of sparsity rather than having to resort to a relaxed version of sparsity to make optimization tractable. Simulations of the proposed dynamical system on both synthetic and natural image data sets demonstrate that the model is capable of probabilistically correct inference, enabling learning of the dictionary as well as parameters of the prior.
  3. Causal discovery has witnessed significant progress over the past decades. In particular, many recent causal discovery methods make use of independent, non-Gaussian noise to achieve identifiability of the causal models. The existence of hidden direct common causes, or confounders, generally makes causal discovery more difficult; whenever they are present, the corresponding causal discovery algorithms can be seen as extensions of overcomplete independent component analysis (OICA). However, existing OICA algorithms usually make strong parametric assumptions on the distribution of independent components, which may be violated on real data, leading to sub-optimal or even wrong solutions. In addition, existing OICA algorithms rely on the Expectation Maximization (EM) procedure that requires computationally expensive inference of the posterior distribution of independent components. To tackle these problems, we present a Likelihood-Free Overcomplete ICA algorithm (LFOICA) that estimates the mixing matrix directly by back-propagation without any explicit assumptions on the density function of independent components. Thanks to its computational efficiency, the proposed method makes a number of causal discovery procedures much more practically feasible. For illustrative purposes, we demonstrate the computational efficiency and efficacy of our method in two causal discovery tasks on both synthetic and real data.
  4. Measured intensity in high-energy monochromatic X-ray diffraction (HEXD) experiments provides information regarding the microstructure of the crystalline material under study. The location of intensity on an areal detector is determined by the lattice spacing and orientation of crystals so that changes in the heterogeneity of these quantities are reflected in the spreading of diffraction peaks over time. High temporal resolution of such dynamics can now be experimentally observed using technologies such as the mixed-mode pixel array detector (MM-PAD) which facilitates in situ dynamic HEXD experiments to study plasticity and its underlying mechanisms. In this paper, we define and demonstrate a feature computed directly from such diffraction time series data quantifying signal spread in a manner that is correlated with plastic deformation of the sample. A distinguishing characteristic of the analysis is the capability to describe the evolution from the distinct diffraction peaks of an undeformed alloy sample through to the non-uniform Debye–Scherrer rings developed upon significant plastic deformation. We build on our previous work modeling data using an overcomplete dictionary by treating temporal measurements jointly to improve signal spread recovery. We demonstrate our approach in simulations and on experimental HEXD measurements captured using the MM-PAD. Our method for characterizing the temporal evolution of signal spread is shown to provide an informative means of data analysis that adds to the capabilities of existing methods. Our work draws on ideas from convolutional sparse coding and requires solving a coupled convex optimization problem based on the alternating direction method of multipliers.
  5. Refinement types enable lightweight verification of functional programs. Algorithms for statically inferring refinement types typically work by reduction to solving systems of constrained Horn clauses extracted from typing derivations. An example is Liquid type inference, which solves the extracted constraints using predicate abstraction. However, the reduction to constraint solving in itself already signifies an abstraction of the program semantics that affects the precision of the overall static analysis. To better understand this issue, we study the type inference problem in its entirety through the lens of abstract interpretation. We propose a new refinement type system that is parametric with the choice of the abstract domain of type refinements as well as the degree to which it tracks context-sensitive control flow information. We then derive an accompanying parametric inference algorithm as an abstract interpretation of a novel data flow semantics of functional programs. We further show that the type system is sound and complete with respect to the constructed abstract semantics. Our theoretical development reveals the key abstraction steps inherent in refinement type inference algorithms. The trade-off between precision and efficiency of these abstraction steps is controlled by the parameters of the type system. Existing refinement type systems and their respective inference algorithms, such as Liquid types, are captured by concrete parameter instantiations. We have implemented our framework in a prototype tool and evaluated it for a range of new parameter instantiations (e.g., using octagons and polyhedra for expressing type refinements). The tool compares favorably against other existing tools. Our evaluation indicates that our approach can be used to systematically construct new refinement type inference algorithms that are both robust and precise.