Title: Sleep-Like Unsupervised Replay Improves Performance When Data Are Limited or Unbalanced (Student Abstract)
The performance of artificial neural networks (ANNs) degrades when training data are limited or imbalanced. In contrast, the human brain can learn quickly from just a few examples. Here, we investigated the role of sleep in improving the performance of ANNs trained with limited data on the MNIST and Fashion MNIST datasets. Sleep was implemented as an unsupervised phase with local Hebbian-type learning rules. We found a significant boost in accuracy after the sleep phase for models trained with limited data in the range of 0.5-10% of the total MNIST or Fashion MNIST datasets. When more than 10% of the total data was used, sleep alone had a slight negative impact on performance, but this was remedied by fine-tuning on the original data. This study sheds light on a potential synaptic weight dynamics strategy employed by the brain during sleep to enhance memory performance when training data are limited or imbalanced.
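The sleep phase is described here only at a high level. As a concrete illustration, below is a minimal numpy sketch of what an unsupervised, Hebbian "sleep" pass over one trained layer could look like; the layer sizes, input sparsity, firing threshold, and learning rate are assumptions for the sketch, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" layer: 784 inputs -> 128 hidden units (weights here are
# random stand-ins for weights learned during supervised training).
W = rng.normal(0.0, 0.1, size=(128, 784))

def sleep_phase(W, steps=1000, lr=1e-3, threshold=0.5):
    """Unsupervised 'sleep' pass: drive the layer with random binary input,
    binarize the response, and apply a local Hebbian update. All
    hyperparameters are illustrative."""
    W = W.copy()
    for _ in range(steps):
        x = (rng.random(784) < 0.1).astype(float)  # stochastic input activity
        h = (W @ x > threshold).astype(float)      # binary, spiking-like response
        # Potentiate weights between co-active pre/post units; depress weights
        # where the post-unit fired without presynaptic input.
        W += lr * np.outer(h, x) - lr * np.outer(h, 1.0 - x)
        np.clip(W, -1.0, 1.0, out=W)               # keep weights bounded
    return W

W_slept = sleep_phase(W)
```

Because both the update and the clipping are purely local to each synapse, the whole phase runs without labels or a global error signal, which is the property the paper attributes to sleep.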
Award ID(s):
2223839
PAR ID:
10544246
Author(s) / Creator(s):
Publisher / Repository:
AAAI
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
38
Issue:
21
ISSN:
2159-5399
Page Range / eLocation ID:
23441 to 23442
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Artificial neural networks (ANNs) show limited performance with scarce or imbalanced training data and face challenges with continual learning, such as forgetting previously learned data after training on new tasks. In contrast, the human brain can learn continuously and from just a few examples. This research explores the impact of 'sleep', an unsupervised phase incorporating stochastic network activation with local Hebbian learning rules, on ANNs trained incrementally with limited and imbalanced datasets, specifically MNIST and Fashion MNIST. We discovered that introducing a sleep phase significantly enhanced accuracy in models trained with limited data. When a few tasks were trained sequentially, sleep replay not only rescued previously learned information that had been forgotten following new task training but also often enhanced performance on prior tasks, especially those trained with limited data. This study highlights the multifaceted role of sleep replay in augmenting learning efficiency and facilitating continual learning in ANNs.
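To make the incremental schedule concrete, here is a self-contained toy sketch of the train-task-1 / train-task-2 / sleep sequence. Everything in it is a stand-in: a delta-rule classifier replaces backprop, random vectors replace MNIST class subsets, and the sleep hyperparameters are illustrative, not the authors' values.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_task(W, X, y, epochs=20, lr=0.05):
    """Supervised delta-rule training on one task (a stand-in for backprop)."""
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = 1.0 / (1.0 + np.exp(-(W @ x)))       # per-class sigmoid outputs
            W += lr * np.outer(np.eye(len(W))[t] - p, x)
    return W

def sleep_phase(W, steps=500, lr=1e-3):
    """Unsupervised Hebbian replay driven by random activity (illustrative)."""
    for _ in range(steps):
        x = (rng.random(W.shape[1]) < 0.1).astype(float)
        h = (W @ x > 0.5).astype(float)
        W += lr * (np.outer(h, x) - np.outer(h, 1.0 - x))
    return W

# Two synthetic "tasks" (random vectors standing in for MNIST class subsets).
D, C = 64, 4
W = np.zeros((C, D))
X1, y1 = rng.random((100, D)), rng.integers(0, 2, size=100)  # task 1: classes 0-1
X2, y2 = rng.random((100, D)), rng.integers(2, 4, size=100)  # task 2: classes 2-3

W = train_task(W, X1, y1)   # learn task 1
W = train_task(W, X2, y2)   # learn task 2; task 1 performance degrades
W = sleep_phase(W)          # sleep replay aims to recover task 1
```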
  2. State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of the other data points. However, such methods are designed for finite sample datasets and lack the ability to generalize to out-of-sample data. Moreover, since the number of self-expressive coefficients grows quadratically with the number of data points, their ability to handle large-scale datasets is often limited. In this paper, we propose a novel framework for subspace clustering, termed Self-Expressive Network (SENet), which employs a properly designed neural network to learn a self-expressive representation of the data. We show that SENet can not only learn self-expressive coefficients with desired properties on the training data but also handle out-of-sample data. We further show that SENet can be leveraged to perform subspace clustering on large-scale datasets. Extensive experiments conducted on synthetic data and real-world benchmark data validate the effectiveness of the proposed method. In particular, SENet yields highly competitive performance on MNIST, Fashion MNIST, and Extended MNIST, and state-of-the-art performance on CIFAR-10.
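For background, the self-expressive model that SENet builds on solves min_C ||X - XC||_F^2 + lam*Omega(C) subject to diag(C) = 0. The numpy sketch below fits this objective by gradient descent on synthetic union-of-subspaces data, using a simple ridge penalty for Omega; SENet's actual contribution, replacing the N x N matrix C with a learned network so that out-of-sample points can be handled, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 60 points drawn from two 3-dimensional subspaces of R^20.
N, D = 60, 20
B1, B2 = rng.normal(size=(D, 3)), rng.normal(size=(D, 3))
X = np.hstack([B1 @ rng.normal(size=(3, N // 2)),
               B2 @ rng.normal(size=(3, N // 2))])        # D x N data matrix
X /= np.linalg.norm(X, axis=0, keepdims=True)             # unit-norm columns

def self_expressive(X, lam=0.1, lr=0.01, steps=2000):
    """Fit C minimizing 0.5*||X - X C||_F^2 + 0.5*lam*||C||_F^2, diag(C) = 0.
    Ridge penalty used for simplicity; sparse penalties are more common."""
    N = X.shape[1]
    C = np.zeros((N, N))
    for _ in range(steps):
        grad = X.T @ (X @ C - X) + lam * C   # gradient of the objective
        C -= lr * grad
        np.fill_diagonal(C, 0.0)             # forbid trivial self-representation
    return C

C = self_expressive(X)
A = np.abs(C) + np.abs(C).T   # affinity matrix for downstream spectral clustering
```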
  3. Convolution is a central operation in Convolutional Neural Networks (CNNs): a kernel is applied to overlapping regions shifted across the image. However, because of the strong correlations in real-world image data, convolutional kernels in effect re-learn redundant data. In this work, we show that this redundancy makes neural network training challenging and propose network deconvolution, a procedure that optimally removes pixel-wise and channel-wise correlations before the data are fed into each layer. Network deconvolution can be computed efficiently at a fraction of the computational cost of a convolution layer. We also show that the deconvolution filters in the first layer of the network resemble the center-surround structure found in biological neurons in the visual regions of the brain. Filtering with such kernels yields a sparse representation, a desired property that has been missing from the training of neural networks. Learning from the sparse representation promotes faster convergence and superior results without the use of batch normalization. We apply our network deconvolution operation to 10 modern neural network models by replacing batch normalization within each. Extensive experiments show that network deconvolution delivers performance improvements in all cases on the CIFAR-10, CIFAR-100, MNIST, Fashion-MNIST, Cityscapes, and ImageNet datasets.
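As a rough illustration of the decorrelation idea, the sketch below applies channel-wise ZCA whitening to a feature map: estimate the channel covariance and multiply by its inverse square root before the data reach the next layer. This is a simplified stand-in; the paper's deconvolution operation also removes pixel-wise (patch-level) correlations and is computed efficiently inside the network.

```python
import numpy as np

rng = np.random.default_rng(3)

def zca_whiten_channels(feat, eps=1e-5):
    """Remove channel-wise correlations from a (C, H, W) feature map by
    multiplying with the inverse square root of the channel covariance."""
    C, H, W = feat.shape
    flat = feat.reshape(C, -1)
    flat = flat - flat.mean(axis=1, keepdims=True)     # center each channel
    cov = flat @ flat.T / flat.shape[1]                # C x C channel covariance
    vals, vecs = np.linalg.eigh(cov)
    inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return (inv_sqrt @ flat).reshape(C, H, W)

# Correlated toy feature map: every channel shares a common signal.
shared = rng.normal(size=(1, 16, 16))
feat = shared + 0.3 * rng.normal(size=(8, 16, 16))
white = zca_whiten_channels(feat)
# The channel covariance of `white` is now approximately the identity.
```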
  4. Sleep is critical to a variety of cognitive functions, and insufficient sleep can have negative consequences for mood and behavior across the lifespan. An important open question is how sleep duration is related to functional brain organization, which may in turn impact cognition. To characterize the functional brain networks related to sleep across youth and young adulthood, we analyzed data from the publicly available Human Connectome Project (HCP) dataset, which includes n-back task-based and resting-state fMRI data from adults aged 22–35 years (task n = 896; rest n = 898). We applied connectome-based predictive modeling (CPM) to predict participants' mean sleep duration from their functional connectivity patterns. Models trained and tested using 10-fold cross-validation predicted self-reported average sleep duration for the past month from n-back task and resting-state connectivity patterns. We replicated this finding in data from the 2-year follow-up session of the Adolescent Brain Cognitive Development (ABCD) Study, which also includes n-back task and resting-state fMRI for adolescents aged 11–12 years (task n = 786; rest n = 1274), as well as Fitbit data reflecting average sleep duration per night over an average of 23.97 days. CPMs trained and tested with 10-fold cross-validation again predicted sleep duration from n-back task and resting-state functional connectivity patterns. Furthermore, demonstrating that predictive models are robust across independent datasets, CPMs trained on rest data from the HCP sample successfully generalized to predict sleep duration in the ABCD Study sample and vice versa. Thus, common resting-state functional brain connectivity patterns reflect sleep duration in youth and young adults.
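A minimal sketch of the CPM pipeline on synthetic data is shown below: within each cross-validation fold, select edges whose strength correlates with the behavior in training subjects, summarize each subject by a summed edge score, and fit a linear model. The data, the correlation threshold of 0.2, and the sign-weighted combination of edges are assumptions for the sketch (standard CPM often fits separate positive- and negative-edge networks).

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic stand-ins: per-subject connectivity edges and sleep hours, where
# only the first 10 edges carry signal about the behavior.
n_sub, n_edge = 200, 500
edges = rng.normal(size=(n_sub, n_edge))
sleep = 7.0 + 0.2 * edges[:, :10].sum(axis=1) + rng.normal(0.0, 0.5, n_sub)

def cpm_fold(train, test, r_thresh=0.2):
    """One CPM fold: select behavior-correlated edges on training subjects,
    summarize each subject by a sign-weighted sum of selected edges, and
    fit a one-dimensional linear model."""
    Xtr, ytr = edges[train], sleep[train]
    Xc = (Xtr - Xtr.mean(0)) / Xtr.std(0)              # standardize edges
    yc = (ytr - ytr.mean()) / ytr.std()                # standardize behavior
    r = Xc.T @ yc / len(ytr)                           # Pearson r per edge
    mask = np.abs(r) > r_thresh                        # edge selection
    score_tr = (Xtr[:, mask] * np.sign(r[mask])).sum(axis=1)
    slope, intercept = np.polyfit(score_tr, ytr, 1)
    score_te = (edges[test][:, mask] * np.sign(r[mask])).sum(axis=1)
    return slope * score_te + intercept

# 10-fold cross-validation, as in the study.
folds = np.array_split(rng.permutation(n_sub), 10)
preds = np.empty(n_sub)
for i, test in enumerate(folds):
    train = np.concatenate([f for j, f in enumerate(folds) if j != i])
    preds[test] = cpm_fold(train, test)
print("predicted vs. observed r:", np.corrcoef(preds, sleep)[0, 1])
```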