skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Award ID contains: 1707400

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters has often been challenging precisely due to their unsupervised nature. Meanwhile, in many real‐world scenarios, there are some noisy supervising auxiliary variables, for instance, subjective diagnostic opinions, that are related to the observed heterogeneity of the unlabeled data. By leveraging information from both supervising auxiliary variables and unlabeled data, we seek to uncover more scientifically interpretable group structures that may be hidden by completely unsupervised analyses. In this work, we propose and develop a new statistical pattern discovery method named supervised convex clustering (SCC) that borrows strength from both information sources and guides towards finding more interpretable patterns via a joint convex fusion penalty. We develop several extensions of SCC to integrate different types of supervising auxiliary variables, to adjust for additional covariates, and to find biclusters. We demonstrate the practical advantages of SCC through simulations and a case study on Alzheimer's disease genomics. Specifically, we discover new candidate genes as well as new subtypes of Alzheimer's disease that can potentially lead to better understanding of the underlying genetic mechanisms responsible for the observed heterogeneity of cognitive decline in older adults. 
    more » « less
  2. Summary Structural learning of Gaussian graphical models in the presence of latent variables has long been a challenging problem. Chandrasekaran et al. (2012) proposed a convex program for estimating a sparse graph plus a low-rank term that adjusts for latent variables; however, this approach poses challenges from both computational and statistical perspectives. We propose an alternative, simple solution: apply a hard-thresholding operator to existing graph selection methods. Conceptually simple and computationally attractive, the approach of thresholding the graphical lasso is shown to be graph selection consistent in the presence of latent variables under a simpler minimum edge strength condition and at an improved statistical rate. The results are extended to estimators for thresholded neighbourhood selection and constrained $$\ell_{1}$$-minimization for inverse matrix estimation as well. We show that our simple thresholded graph estimators yield stronger empirical results than existing methods for the latent variable graphical model problem, and we apply them to a neuroscience case study on estimating functional neural connections. 
    more » « less
  3. In computational neuroscience, recurrent neural networks are widely used to model neural activity and learning. In many studies, fixed points of recurrent neural networks are used to model neural responses to static or slowly changing stimuli, such as visual cortical responses to static visual stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. In parallel, training fixed points is a central topic in the study of deep equilibrium models in machine learning. A natural approach is to use gradient descent on the Euclidean space of weights. We show that this approach can lead to poor learning performance due in part to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We demonstrate that these learning rules avoid singularities and learn more effectively than standard gradient descent. The new learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should be expected to follow the negative Euclidean gradient of synaptic weights. 
    more » « less
  4. Abstract Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts the focus from those capabilities like game playing and language that are especially well-developed or uniquely human to those capabilities – inherited from over 500 million years of evolution – that are shared with all animals. Building models that can pass the embodied Turing test will provide a roadmap for the next generation of AI. 
    more » « less
  5. Success in many real-world tasks depends on our ability to dynamically track hidden states of the world. We hypothesized that neural populations estimate these states by processing sensory history through recurrent interactions which reflect the internal model of the world. To test this, we recorded brain activity in posterior parietal cortex (PPC) of monkeys navigating by optic flow to a hidden target location within a virtual environment, without explicit position cues. In addition to sequential neural dynamics and strong interneuronal interactions, we found that the hidden state - monkey’s displacement from the goal - was encoded in single neurons, and could be dynamically decoded from population activity. The decoded estimates predicted navigation performance on individual trials. Task manipulations that perturbed the world model induced substantial changes in neural interactions, and modified the neural representation of the hidden state, while representations of sensory and motor variables remained stable. The findings were recapitulated by a task-optimized recurrent neural network model, suggesting that task demands shape the neural interactions in PPC, leading them to embody a world model that consolidates information and tracks task-relevant hidden states. 
    more » « less
  6. Two facts about cortex are widely accepted: neuronal responses show large spiking variability with near Poisson statistics and cortical circuits feature abundant recurrent connections between neurons. How these spiking and circuit properties combine to support sensory representation and information processing is not well understood. We build a theoretical framework showing that these two ubiquitous features of cortex combine to produce optimal sampling-based Bayesian inference. Recurrent connections store an internal model of the external world, and Poissonian variability of spike responses drives flexible sampling from the posterior stimulus distributions obtained by combining feedforward and recurrent neuronal inputs. We illustrate how this framework for sampling-based inference can be used by cortex to represent latent multivariate stimuli organized either hierarchically or in parallel. A neural signature of such network sampling are internally generated differential correlations whose amplitude is determined by the prior stored in the circuit, which provides an experimentally testable prediction for our framework. 
    more » « less
  7. Probabilistic graphical models provide a powerful tool to describe complex statistical structure, with many real-world applications in science and engineering from controlling robotic arms to understanding neuronal computations. A major challenge for these graphical models is that inferences such as marginalization are intractable for general graphs. These inferences are often approximated by a distributed message-passing algorithm such as Belief Propagation, which does not always perform well on graphs with cycles, nor can it always be easily specified for complex continuous probability distributions. Such difficulties arise frequently in expressive graphical models that include intractable higher-order interactions. In this paper we define the Recurrent Factor Graph Neural Network (RF-GNN) to achieve fast approximate inference on graphical models that involve many-variable interactions. Experimental results on several families of graphical models demonstrate the out-of-distribution generalization capability of our method to different sized graphs, and indicate the domain in which our method outperforms Belief Propagation (BP). Moreover, we test the RF-GNN on a real-world Low-Density Parity-Check dataset as a benchmark along with other baseline models including BP variants and other GNN methods. Overall we find that RF-GNNs outperform other methods under high noise levels. 
    more » « less