Energybased models (EBMs) assign an unnormalized log probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, Bayesian reasoning and many more. But, the training of EBMs using standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentially alleviates this problem. In particular, denoisingscore matching has been successfully used to train EBMs. Using noisy data samples with one fixed noise level, these models learn fast and yield good results in data denoising. However, demonstrations of such models in the highquality sample synthesis of highdimensional data were lacking. Recently, a paper showed that a generative model trained by denoisingscore matching accomplishes excellent sample synthesis when trained with data samples corrupted with multiple levels of noise. Here we provide an analysis and empirical evidence showing that training with multiple noise levels is necessary when the data dimension is high. Leveraging this insight, we propose a novel EBM trained with multiscale denoisingscore matching. Our model exhibits a datageneration performance comparable to stateoftheart techniques such as GANs and sets a new baseline for EBMs. The proposed model also provides density information and performs well on an imageinpainting task.
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Free, publiclyaccessible full text available October 1, 2024

Abstract A prominent approach to solving combinatorial optimization problems on parallel hardware is Ising machines, i.e., hardware implementations of networks of interacting binary spin variables. Most Ising machines leverage secondorder interactions although important classes of optimization problems, such as satisfiability problems, map more seamlessly to Ising networks with higherorder interactions. Here, we demonstrate that higherorder Ising machines can solve satisfiability problems more resourceefficiently in terms of the number of spin variables and their connections when compared to traditional secondorder Ising machines. Further, our results show on a benchmark dataset of Boolean
k satisfiability problems that higherorder Ising machines implemented with coupled oscillators rapidly find solutions that are better than secondorder Ising machines, thus, improving the current stateoftheart for Ising machines.Free, publiclyaccessible full text available December 1, 2024 
We investigate the task of retrieving information from compositional distributed representations formed by hyperdimensional computing/vector symbolic architectures and present novel techniques that achieve new information rate bounds. First, we provide an overview of the decoding techniques that can be used to approach the retrieval task. The techniques are categorized into four groups. We then evaluate the considered techniques in several settings that involve, for example, inclusion of external noise and storage elements with reduced precision. In particular, we find that the decoding techniques from the sparse coding and compressed sensing literature (rarely used for hyperdimensional computing/vector symbolic architectures) are also well suited for decoding information from the compositional distributed representations.Combining these decoding techniqueswith interference cancellation ideas from communications improves previously reported bounds (Hersche et al., 2021) of the information rate of the distributed representations from 1.20 to 1.40 bits per dimension for smaller codebooks and from 0.60 to 1.26 bits per dimension for larger codebooks.more » « less

Multilayer neural networks set the current state of the art for many technical classification problems. But, these networks are still, essentially, black boxes in terms of analyzing them and predicting their performance. Here, we develop a statistical theory for the onelayer perceptron and show that it can predict performances of a surprisingly large variety of neural networks with different architectures. A general theory of classification with perceptrons is developed by generalizing an existing theory for analyzing reservoir computing models and connectionist models for symbolic reasoning known as vector symbolic architectures. Our statistical theory offers three formulas leveraging the signal statistics with increasing detail. The formulas are analytically intractable, but can be evaluated numerically. The description level that captures maximum details requires stochastic sampling methods. Depending on the network model, the simpler formulas already yield high prediction accuracy. The quality of the theory predictions is assessed in three experimental settings, a memorization task for echo state networks (ESNs) from reservoir computing literature, a collection of classification datasets for shallow randomly connected networks, and the ImageNet dataset for deep convolutional neural networks. We find that the second description level of the perceptron theory can predict the performance of types of ESNs, which could not be described previously. Furthermore, the theory can predict deep multilayer neural networks by being applied to their output layer. While other methods for prediction of neural networks performance commonly require to train an estimator model, the proposed theory requires only the first two moments of the distribution of the postsynaptic sums in the output neurons. Moreover, the perceptron theory compares favorably to other methods that do not rely on training an estimator model.more » « less

This article reviews recent progress in the development of the computing framework Vector Symbolic Architectures (also known as Hyperdimensional Computing). This framework is well suited for implementation in stochastic, nanoscale hardware and it naturally expresses the types of cognitive operations required for Artificial Intelligence (AI). We demonstrate in this article that the ringlike algebraic structure of Vector Symbolic Architectures offers simple but powerful operations on highdimensional vectors that can support all data structures and manipulations relevant in modern computing. In addition, we illustrate the distinguishing feature of Vector Symbolic Architectures, “computing in superposition,” which sets it apart from conventional computing. This latter property opens the door to efficient solutions to the difficult combinatorial search problems inherent in AI applications. Vector Symbolic Architectures are Turing complete, as we show, and we see them acting as a framework for computing with distributed representations in myriad AI settings. This paper serves as a reference for computer architects by illustrating techniques and philosophy of VSAs for distributed computing and relevance to emerging computing hardware, such as neuromorphic computing.more » « less

null (Ed.)Variable binding is a cornerstone of symbolic reasoning and cognition. But how binding can be implemented in connectionist models has puzzled neuroscientists, cognitive psychologists, and neural network researchers for many decades. One type of connectionist model that naturally includes a binding operation is vector symbolic architectures (VSAs). In contrast to other proposals for variable binding, the binding operation in VSAs is dimensionalitypreserving, which enables representing complex hierarchical data structures, such as trees, while avoiding a combinatoric expansion of dimensionality. Classical VSAs encode symbols by dense randomized vectors, in which information is distributed throughout the entire neuron population. By contrast, in the brain, features are encoded more locally, by the activity of single neurons or small groups of neurons, often forming sparse vectors of neural activation. Following Laiho et al. (2015), we explore symbolic reasoning with a special case of sparse distributed representations. Using techniques from compressed sensing, we first show that variable binding in classical VSAs is mathematically equivalent to tensor product binding between sparse feature vectors, another wellknown binding operation which increases dimensionality. This theoretical result motivates us to study two dimensionalitypreserving binding methods that include a reduction of the tensor matrix into a single sparse vector. One binding method for general sparse vectors uses random projections, the other, blocklocal circular convolution, is defined for sparse vectors with block structure, sparse blockcodes. Our experiments reveal that blocklocal circular convolution binding has ideal properties, whereas random projection based binding also works, but is lossy. We demonstrate in example applications that a VSA with blocklocal circular convolution and sparse blockcodes reaches similar performance as classical VSAs. Finally, we discuss our results in the context of neuroscience and neural networks.more » « less

null (Ed.)Vector space models for symbolic processing that encode symbols by random vectors have been proposed in cognitive science and connectionist communities under the names Vector Symbolic Architecture (VSA), and, synonymously, Hyperdimensional (HD) computing. In this paper, we generalize VSAs to function spaces by mapping continuousvalued data into a vector space such that the inner product between the representations of any two data points represents a similarity kernel. By analogy to VSA, we call this new function encoding and computing framework Vector Function Architecture (VFA). In VFAs, vectors can represent individual data points as well as elements of a function space (a reproducing kernel Hilbert space). The algebraic vector operations, inherited from VSA, correspond to welldefined operations in function space. Furthermore, we study a previously proposed method for encoding continuous data, fractional power encoding (FPE), which uses exponentiation of a random base vector to produce randomized representations of data points and fulfills the kernel properties for inducing a VFA. We show that the distribution from which elements of the base vector are sampled determines the shape of the FPE kernel, which in turn induces a VFA for computing with bandlimited functions. In particular, VFAs provide an algebraic framework for implementing largescale kernel machines with random features, extending Rahimi and Recht, 2007. Finally, we demonstrate several applications of VFA models to problems in image recognition, density estimation and nonlinear regression. Our analyses and results suggest that VFAs constitute a powerful new framework for representing and manipulating functions in distributed neural systems, with myriad applications in artificial intelligence.more » « less

null (Ed.)Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability distributions and offer guarantees of exact sampling. However, in the continuous case, unfavorable geometry of the target distribution can greatly limit the efficiency of MCMC methods. Augmenting samplers with neural networks can potentially improve their efficiency. Previous neural networkbased samplers were trained with objectives that either did not explicitly encourage exploration, or contained a term that encouraged exploration but only for well structured distributions. Here we propose to maximize proposal entropy for adapting the proposal to distributions of any shape. To optimize proposal entropy directly, we devised a neural network MCMC sampler that has a flexible and tractable proposal distribution. Specifically, our network architecture utilizes the gradient of the target distribution for generating proposals. Our model achieved significantly higher efficiency than previous neural network MCMC techniques in a variety of sampling tasks, sometimes by more than an order magnitude. Further, the sampler was demonstrated through the training of a convergent energybased model of natural images. The adaptive sampler achieved unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sample. The trained sampler also achieved better sample quality.more » « less

Mounting evidence suggests that during conscious states, the electrodynamics of the cortex are poised near a critical point or phase transition and that this nearcritical behavior supports the vast flow of information through cortical networks during conscious states. Here, we empirically identify a mathematically specific critical point near which waking cortical oscillatory dynamics operate, which is known as the edgeofchaos critical point, or the boundary between stability and chaos. We do so by applying the recently developed modified 01 chaos test to electrocorticography (ECoG) and magnetoencephalography (MEG) recordings from the cortices of humans and macaques across normal waking, generalized seizure, anesthesia, and psychedelic states. Our evidence suggests that cortical information processing is disrupted during unconscious states because of a transition of lowfrequency cortical electric oscillations away from this critical point; conversely, we show that psychedelics may increase the information richness of cortical activity by tuning lowfrequency cortical oscillations closer to this critical point. Finally, we analyze clinical electroencephalography (EEG) recordings from patients with disorders of consciousness (DOC) and show that assessing the proximity of slow cortical oscillatory electrodynamics to the edgeofchaos critical point may be useful as an index of consciousness in the clinical setting.more » « less