Title: Assemblies of neurons learn to classify well-separated distributions
An assembly is a large population of neurons whose synchronous firing represents a memory, concept, word, or other cognitive category. Assemblies are believed to provide a bridge between high-level cognitive phenomena and low-level neural activity. Recently, a computational system called the Assembly Calculus (AC), with a repertoire of biologically plausible operations on assemblies, has been shown capable of simulating not only arbitrary space-bounded computation but also complex cognitive phenomena such as language, reasoning, and planning. However, the mechanism whereby assemblies can mediate learning has not been known. Here we present such a mechanism, and prove rigorously that, for simple classification problems defined on distributions of labeled assemblies, a new assembly representing each class can be reliably formed in response to a few stimuli from the class; this assembly is henceforth reliably recalled in response to new stimuli from the same class. Furthermore, such class assemblies will be distinguishable as long as the respective classes are reasonably separated, for example when they are clusters of similar assemblies or, more generally, separable with margin by a linear threshold function. To prove these results, we draw on random graph theory with dynamic edge weights to estimate sequences of activated vertices, yielding strong generalizations of previous calculations and theorems in this field over the past five years. These theorems are backed up by experiments demonstrating the successful formation of assemblies which represent concept classes on synthetic data drawn from such distributions, and also on MNIST, which lends itself to classification through one assembly per digit. Seen as a learning algorithm, this mechanism is entirely online, generalizes from very few samples, and requires only mild supervision, all key attributes of learning in a model of the brain.
We argue that this learning mechanism, supported by separate sensory pre-processing mechanisms for extracting attributes, such as edges or phonemes, from real-world data, can be the basis of biological learning in cortex.
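
The abstract does not state the model's parameters or update rules, so the following is only a rough illustrative sketch, not the authors' implementation. It assumes a random bipartite graph from input neurons to a learning area, a top-k ("k-cap") firing rule, and multiplicative Hebbian plasticity; recurrent connections within the area are omitted for brevity. All names and values (`N_IN`, `K`, `P`, `BETA`, `sample_stimulus`, and so on) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# All values below are assumed for illustration; none come from the abstract.
N_IN, N_AREA = 1000, 1000   # input neurons, neurons in the learning area
K = 50                      # cap size: number of neurons allowed to fire
P = 0.1                     # probability of a synapse in the random graph
BETA = 0.5                  # multiplicative Hebbian plasticity rate

# Random feedforward synapses (1.0 where a synapse exists, else 0.0).
W = (rng.random((N_AREA, N_IN)) < P).astype(float)


def k_cap(total_input, k):
    """Indices of the k neurons receiving the largest total input."""
    return np.argpartition(total_input, -k)[-k:]


def sample_stimulus(core, p_core=0.9, p_noise=0.01):
    """Noisy stimulus: core neurons fire often, all others rarely."""
    x = rng.random(N_IN) < p_noise
    x[core] = rng.random(core.size) < p_core
    return x


def present(x, learn):
    """Fire the k-cap winners; optionally strengthen the synapses used."""
    winners = k_cap(W[:, x].sum(axis=1), K)
    if learn:
        # Hebbian update: scale synapses from active inputs to winners.
        W[np.ix_(winners, np.flatnonzero(x))] *= 1.0 + BETA
    return winners


def train(core, n_samples=5):
    """Present a few stimuli from one class; keep the last set of winners."""
    for _ in range(n_samples):
        assembly = present(sample_stimulus(core), learn=True)
    return set(assembly.tolist())


# Two well-separated classes: clusters around disjoint core sets.
core_a, core_b = np.arange(0, 50), np.arange(50, 100)
assembly_a = train(core_a)
assembly_b = train(core_b)

# Recall: a fresh class-A stimulus should mostly re-activate assembly A.
recalled = set(present(sample_stimulus(core_a), learn=False).tolist())
overlap_a = len(recalled & assembly_a)
overlap_b = len(recalled & assembly_b)
print(f"overlap with A: {overlap_a}, overlap with B: {overlap_b}")
```

In this toy setting, classification amounts to comparing the neurons that fire for a new stimulus against each stored class assembly and picking the class with the larger overlap. Training is online (one stimulus at a time) and uses only a handful of samples per class, mirroring the attributes the abstract highlights.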
Award ID(s):
2134105 2007443 1909756
PAR ID:
10343851
Journal Name:
Conference on Learning Theory
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Even as machine learning exceeds human-level performance on many applications, the generality, robustness, and rapidity of the brain’s learning capabilities remain unmatched. How cognition arises from neural activity is the central open question in neuroscience, inextricable from the study of intelligence itself. A simple formal model of neural activity was proposed in Papadimitriou (2020) and has been subsequently shown, through both mathematical proofs and simulations, to be capable of implementing certain simple cognitive operations via the creation and manipulation of assemblies of neurons. However, many intelligent behaviors rely on the ability to recognize, store, and manipulate temporal sequences of stimuli (planning, language, and navigation, to list a few). Here we show that, in the same model, time can be captured naturally as precedence through synaptic weights and plasticity, and, as a result, a range of computations on sequences of assemblies can be carried out. In particular, repeated presentation of a sequence of stimuli leads to the memorization of the sequence through corresponding neural assemblies: upon future presentation of any stimulus in the sequence, the corresponding assembly and those that follow it will be activated, one after the other, until the end of the sequence. If the stimulus sequence is presented to two brain areas simultaneously, a scaffolded representation is created, resulting in more efficient memorization and recall, in agreement with cognitive experiments. Finally, we show that any finite state machine can be learned in a similar way, through the presentation of appropriate patterns of sequences. Through an extension of this mechanism, the model can be shown to be capable of universal computation. We support our analysis with a number of experiments that probe the limits of learning in this model.
Taken together, these results provide a concrete hypothesis for the basis of the brain’s remarkable abilities to compute and learn, with sequences playing a vital role.
  2. Recent research in the theory of overparametrized learning has sought to establish generalization guarantees in the interpolating regime. Such results have been established for a few common classes of methods, but so far not for ensemble methods. We devise an ensemble classification method that simultaneously interpolates the training data, and is consistent for a broad class of data distributions. To this end, we define the manifold-Hilbert kernel for data distributed on a Riemannian manifold. We prove that kernel smoothing regression and classification using the manifold-Hilbert kernel are weakly consistent in the setting of Devroye et al. [19]. For the sphere, we show that the manifold-Hilbert kernel can be realized as a weighted random partition kernel, which arises as an infinite ensemble of partition-based classifiers.
  3. Recent research in the theory of overparametrized learning has sought to establish generalization guarantees in the interpolating regime. Such results have been established for a few common classes of methods, but so far not for ensemble methods. We devise an ensemble classification method that simultaneously interpolates the training data, and is consistent for a broad class of data distributions. To this end, we define the manifold-Hilbert kernel for data distributed on a Riemannian manifold. We prove that kernel smoothing regression and classification using the manifold-Hilbert kernel are weakly consistent in the setting of Devroye et al. [22]. For the sphere, we show that the manifold-Hilbert kernel can be realized as a weighted random partition kernel, which arises as an infinite ensemble of partition-based classifiers. 
  4. Multigroup discriminant analysis is an important supervised learning technique in the classification framework, with applications in various disciplines. Its objective is to approximate underlying class distributions based on data attributes or features. After the class distributions are estimated, the classification task can be readily carried out for data points with unknown labels. Linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) are statistical procedures widely used by practitioners due to their practicality and generally good performance. Both procedures rely on the assumption of normally distributed classes and can be affected by deviations from multivariate normality. To address this model deficiency, we propose an extension of LDA and QDA that relies on the idea of transformation and can readily accommodate asymmetry and skewness in data classes. Through a set of simulation studies and applications to real-life data sets, we show that the developed technique is promising, as it outperforms competitors in a variety of cases.
  5. The task of few-shot graph classification aims to assign class labels to graph samples, where only a limited number of labeled graphs are provided for each class. To deal with the problem brought about by label scarcity, recent works have focused on adopting the prevalent few-shot learning framework to ensure fast adaptations to classes with limited labeled graphs. In general, these studies propose to accumulate meta-knowledge across various base classes with sufficient labeled graphs, and then generalize such meta-knowledge to novel classes, which are disjoint from base classes and consist of limited labeled graphs. However, existing studies generally ignore the distinct distribution shifts between base classes and novel classes, leading to unsatisfactory adaptation performance. On the other hand, it remains challenging to address this issue due to the potential variance in distributions between classes. To tackle this problem, we propose a novel generative few-shot graph classification framework that can promote adaptation performance by generating adaptive structures for graphs in novel classes. Our framework incorporates a generative model to modify the graph structures for adaptation. We further conduct extensive experiments to validate the effectiveness of our framework. 