

Title: Similarity Learning and Generalization with Limited Data: A Reservoir Computing Approach
We investigate the ways in which a machine learning architecture known as Reservoir Computing learns concepts such as "similar" and "different" and other relationships between image pairs and generalizes these concepts to previously unseen classes of data. We present two Reservoir Computing architectures, which loosely resemble neural dynamics, and show that a Reservoir Computer (RC) trained to identify relationships between image pairs drawn from a subset of training classes generalizes the learned relationships to substantially different classes unseen during training. We demonstrate our results on the simple MNIST handwritten digit database as well as a database of depth maps of visual scenes in videos taken from a moving camera. We consider image pair relationships such as images from the same class; images from the same class with one image superposed with noise, rotated 90°, blurred, or scaled; and images from different classes. We observe that the reservoir acts as a nonlinear filter projecting the input into a higher-dimensional space in which the relationships are separable; i.e., the reservoir system state trajectories display different dynamical patterns that reflect the corresponding input pair relationships. Thus, as opposed to training in the entire high-dimensional reservoir space, the RC only needs to learn characteristic features of these dynamical patterns, allowing it to perform well with very few training examples compared with conventional feed-forward machine learning techniques such as deep learning. In generalization tasks, we observe that RCs perform significantly better than state-of-the-art, feed-forward, pair-based architectures such as convolutional and deep Siamese Neural Networks (SNNs). We also show that RCs can not only generalize relationships, but also generalize combinations of relationships, providing robust and effective image pair classification. Our work helps bridge the gap between explainable machine learning with small datasets and biologically inspired analogy-based learning, pointing to new directions in the investigation of learning processes.
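To make the reservoir idea concrete, the following is a minimal echo-state-style sketch of training only a linear readout on reservoir states driven by an image pair. The reservoir size, leak rate, column-by-column input encoding, and ridge-regression readout are illustrative assumptions for this sketch, not the specific architectures studied in the paper.

```python
# Minimal echo-state-network sketch for image-pair relationship classification.
# Illustrative reconstruction only: reservoir size, input encoding (feeding
# image columns as timesteps), and the ridge readout are assumptions.
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)

N_RES = 500          # reservoir size (assumed)
LEAK = 0.3           # leak rate (assumed)
SPECTRAL_RADIUS = 0.9

def make_reservoir(n_in, n_res):
    """Random, fixed input and recurrent weights; only the readout is trained."""
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    W *= SPECTRAL_RADIUS / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def reservoir_features(pair, W_in, W):
    """Drive the reservoir with the two images column-by-column and
    return the time-averaged reservoir state as a feature vector."""
    # pair: (2, H, W); each timestep sees one column from each image.
    u_seq = np.concatenate([pair[0], pair[1]], axis=0).T   # (W, 2*H)
    x = np.zeros(W_in.shape[0])
    states = []
    for u in u_seq:
        x = (1 - LEAK) * x + LEAK * np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.mean(states, axis=0)

# Toy data standing in for MNIST pairs: label 1 = "same class", 0 = "different".
H = Wd = 28
pairs = rng.random((200, 2, H, Wd))
labels = rng.integers(0, 2, size=200)

W_in, W = make_reservoir(2 * H, N_RES)
X = np.array([reservoir_features(p, W_in, W) for p in pairs])

readout = RidgeClassifier(alpha=1.0).fit(X, labels)   # only this layer is trained
print("train accuracy:", readout.score(X, labels))
```

Because the reservoir weights stay fixed, the only trained component is the linear readout over the reservoir's dynamical response, which is why so few labeled pairs are needed.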
Award ID(s):
1632976
NSF-PAR ID:
10287449
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Complexity
Volume:
2018
ISSN:
1076-2787
Page Range / eLocation ID:
1 to 15
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Puyol Anton, E ; Pop, M ; Sermesant, M ; Campello, V ; Lalande, A ; Lekadir, K ; Suinesiaputra, A ; Camara, O ; Young, A (Ed.)
Cardiac cine magnetic resonance imaging (CMRI) is the reference standard for assessing cardiac structure as well as function. However, CMRI data present large variations among different centers, vendors, and patients with various cardiovascular diseases. Since typical deep-learning-based segmentation methods are usually trained using a limited number of ground truth annotations, they may not generalize well to unseen MR images, due to the variations between the training and testing data. In this study, we proposed an approach towards building a generalizable deep-learning-based model for cardiac structure segmentation from multi-vendor, multi-center, and multi-disease CMRI data. We used a novel combination of image augmentation and a consistency loss function to improve model robustness to typical variations in CMRI data. The proposed image augmentation strategy leverages unlabeled data by a) using CycleGAN to generate images in different styles and b) exchanging the low-frequency features of images from different vendors (see the sketch below). Our model architecture was based on an attention-gated U-Net model that learns to focus on cardiac structures of varying shapes and sizes while suppressing irrelevant regions. The proposed augmentation and consistency training method demonstrated improved performance on CMRI images from new vendors and centers. When evaluated using CMRI data from 4 vendors and 6 clinical centers, our method was generally able to produce accurate segmentations of cardiac structures.
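The low-frequency exchange mentioned above can be illustrated as a Fourier amplitude swap between two images from different vendors. The window size `beta` and the exact spectral components exchanged are assumptions of this sketch; the paper's implementation may differ.

```python
# Illustrative sketch of exchanging low-frequency features between vendors,
# implemented here as a Fourier amplitude swap (assumed, not the paper's exact code).
import numpy as np

def swap_low_freq(src, ref, beta=0.1):
    """Replace the low-frequency amplitude of `src` with that of `ref`,
    keeping the phase of `src` (which carries the anatomical structure)."""
    F_src = np.fft.fftshift(np.fft.fft2(src))
    F_ref = np.fft.fftshift(np.fft.fft2(ref))

    amp_src, phase_src = np.abs(F_src), np.angle(F_src)
    amp_ref = np.abs(F_ref)

    h, w = src.shape
    b = int(min(h, w) * beta)          # half-width of the low-frequency window
    cy, cx = h // 2, w // 2
    amp_src[cy - b:cy + b, cx - b:cx + b] = amp_ref[cy - b:cy + b, cx - b:cx + b]

    F_new = amp_src * np.exp(1j * phase_src)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_new)))

# Usage: give a vendor-A slice the low-frequency "style" of a vendor-B slice.
img_a = np.random.rand(256, 256)   # placeholder for a vendor-A CMRI slice
img_b = np.random.rand(256, 256)   # placeholder for a vendor-B CMRI slice
augmented = swap_low_freq(img_a, img_b, beta=0.05)
```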
  2.
The problem of learning to generalize to unseen classes during the training step, also known as few-shot classification, has attracted considerable attention. Initialization-based methods, such as the gradient-based model-agnostic meta-learning (MAML) [1], tackle the few-shot learning problem by "learning to fine-tune". The goal of these approaches is to learn proper model initialization so that the classifiers for new classes can be learned from a few labeled examples with a small number of gradient update steps. Few-shot meta-learning is well known for its fast adaptation and accurate generalization to unseen tasks [2]. Learning fairly, with unbiased outcomes, is another significant hallmark of human intelligence, which is rarely touched in few-shot meta-learning. In this work, we propose a novel Primal-Dual Fair Meta-learning framework, namely PDFM, which learns to train fair machine learning models using only a few examples based on data from related tasks. The key idea is to learn a good initialization of a fair model's primal and dual parameters so that it can adapt to a new fair learning task via a few gradient update steps (see the sketch below). Instead of manually tuning the dual parameters as hyperparameters via a grid search, PDFM optimizes the initialization of the primal and dual parameters jointly for fair meta-learning via a subgradient primal-dual approach. We further instantiate an example of bias control using decision boundary covariance (DBC) [3] as the fairness constraint for each task, and demonstrate the versatility of our proposed approach by applying it to classification on three real-world datasets. Our experiments show substantial improvements over the best prior work for this setting.
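A schematic sketch of the primal-dual adaptation idea follows. The linear model, the simplified decision-boundary-covariance term, the first-order meta-update, and all step sizes are assumptions made for illustration; the PDFM algorithm itself is more general.

```python
# Schematic sketch of primal-dual fair meta-learning on a linear classifier.
# All modeling choices here (logistic loss, simplified DBC term, first-order
# outer update, step sizes) are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(1)

def dbc(theta, X, s):
    """Decision boundary covariance between sensitive attribute s and the
    signed distance to the linear decision boundary."""
    return np.abs(np.mean((s - s.mean()) * (X @ theta)))

def task_loss_grad(theta, X, y):
    """Logistic-loss gradient for a linear model (no bias term, for brevity)."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return X.T @ (p - y) / len(y)

def dbc_grad(theta, X, s):
    g = X.T @ (s - s.mean()) / len(s)
    return np.sign(np.mean((s - s.mean()) * (X @ theta))) * g

D = 5
theta_meta, lam_meta = np.zeros(D), 0.1   # meta-initialization of primal/dual params
alpha, eta = 0.1, 0.01                    # inner and outer step sizes (assumed)

for _ in range(100):                      # outer loop over sampled tasks
    X = rng.normal(size=(64, D))
    s = (rng.random(64) > 0.5).astype(float)              # sensitive attribute
    y = ((X[:, 0] + 0.5 * s + rng.normal(scale=0.1, size=64)) > 0).astype(float)

    # Inner adaptation: one primal step on the Lagrangian, one dual subgradient step.
    theta = theta_meta - alpha * (task_loss_grad(theta_meta, X, y)
                                  + lam_meta * dbc_grad(theta_meta, X, s))
    lam = max(0.0, lam_meta + alpha * dbc(theta, X, s))

    # Outer (meta) update: first-order approximation moving the initialization
    # toward parameters that adapt well on this task.
    theta_meta -= eta * (task_loss_grad(theta, X, y) + lam * dbc_grad(theta, X, s))
    lam_meta += eta * (lam - lam_meta)

print("meta-learned dual parameter:", lam_meta)
```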
Airborne remote sensing offers unprecedented opportunities to efficiently monitor vegetation, but methods to delineate and classify individual plant species using the collected data are still actively being developed and improved. The Integrating Data science with Trees and Remote Sensing (IDTReeS) plant identification competition openly invited scientists to create and compare individual tree mapping methods. Participants were tasked with training taxon identification algorithms based on two sites, and then transferring their methods to a third unseen site, using field-based plant observations in combination with airborne remote sensing image data products from the National Ecological Observatory Network (NEON). These data were captured by a high-resolution digital camera sensitive to red, green, and blue (RGB) light, a hyperspectral imaging spectrometer spanning visible to shortwave-infrared wavelengths, and a lidar system, capturing the spectral and structural properties of vegetation. As participants in the IDTReeS competition, we developed a two-stage deep learning approach to integrate NEON remote sensing data from all three sensors and classify individual plant species and genera. The first stage was a convolutional neural network that generates taxon probabilities from RGB images, and the second stage was a fusion neural network that "learns" how to combine these probabilities with hyperspectral and lidar data (see the sketch below). Our two-stage approach leverages the ability of neural networks to flexibly and automatically extract descriptive features from complex, high-dimensional image data. Our method achieved an overall classification accuracy of 0.51 on the training set, and 0.32 on the test set, which contained data from an unseen site with unknown taxon classes. Although transferability of classification algorithms to unseen sites with unknown species and genus classes proved to be a challenging task, developing methods with openly available NEON data that will be collected in a standardized format for 30 years allows for continual improvements and major gains for members of the computational ecology community. We outline promising directions related to data preparation and processing techniques for further investigation, and provide our code to contribute to open reproducible science efforts.
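The two-stage design can be sketched as below. The layer sizes, number of taxa, and feature dimensions are placeholders, not the competition entry's actual configuration.

```python
# Minimal PyTorch sketch of the two-stage design: a CNN maps an RGB crop to
# taxon probabilities, and a fusion network combines those probabilities with
# hyperspectral and lidar features. All dimensions are assumed placeholders.
import torch
import torch.nn as nn

N_TAXA, N_HSI, N_LIDAR = 20, 369, 8   # assumed dimensions

class RGBTaxonCNN(nn.Module):
    """Stage 1: taxon probabilities from an RGB image crop."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, N_TAXA)

    def forward(self, rgb):
        z = self.features(rgb).flatten(1)
        return torch.softmax(self.head(z), dim=1)

class FusionNet(nn.Module):
    """Stage 2: combine RGB-derived probabilities with hyperspectral and lidar data."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(N_TAXA + N_HSI + N_LIDAR, 128), nn.ReLU(),
            nn.Linear(128, N_TAXA),
        )

    def forward(self, rgb_probs, hsi, lidar):
        return self.mlp(torch.cat([rgb_probs, hsi, lidar], dim=1))

# Usage with dummy tensors standing in for one batch of tree-crown samples.
rgb = torch.rand(4, 3, 64, 64)
hsi = torch.rand(4, N_HSI)
lidar = torch.rand(4, N_LIDAR)
logits = FusionNet()(RGBTaxonCNN()(rgb), hsi, lidar)
print(logits.shape)   # (4, N_TAXA)
```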
Confocal microscopy is a standard approach for obtaining volumetric images of a sample with high axial and lateral resolution, especially when dealing with scattering samples. Unfortunately, a confocal microscope is quite expensive compared to traditional microscopes. In addition, the point scanning in confocal microscopy leads to slow imaging speed and photobleaching due to the high dose of laser energy. In this paper, we demonstrate how advances in machine learning can be exploited to teach a traditional wide-field microscope, one that is available in every lab, to produce 3D volumetric images like a confocal microscope. The key idea is to obtain multiple images with different focus settings using a wide-field microscope and use a 3D generative adversarial network (GAN) based neural network to learn the mapping between the blurry, low-contrast image stacks obtained using a wide-field microscope and the sharp, high-contrast image stacks obtained using a confocal microscope (see the sketch below). After training the network with wide-field-confocal stack pairs, the network can reliably and accurately reconstruct 3D volumetric images that rival confocal images in terms of lateral resolution, z-sectioning, and image contrast. Our experimental results demonstrate generalization ability to handle unseen data, stability in the reconstruction results, and high spatial resolution even when imaging thick (∼40 micron), highly scattering samples. We believe that such learning-based microscopes have the potential to bring confocal imaging quality to every lab that has a wide-field microscope.
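A schematic training step for such a stack-to-stack mapping is sketched below. The network depths, channel counts, and loss weights are assumptions made for the sketch, not the paper's architecture.

```python
# Schematic PyTorch sketch of the wide-field -> confocal mapping: a 3D
# convolutional generator translates a blurry focal stack into a sharp stack,
# trained against a 3D discriminator with an added L1 term. All architecture
# details and loss weights here are assumptions.
import torch
import torch.nn as nn

class Generator3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 1, 3, padding=1),
        )
    def forward(self, x):        # x: (B, 1, D, H, W) wide-field focal stack
        return self.net(x)

class Discriminator3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(16, 1, 4, stride=2, padding=1),
        )
    def forward(self, x):
        return self.net(x)       # patch-wise real/fake scores

G, D = Generator3D(), Discriminator3D()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

widefield = torch.rand(1, 1, 16, 64, 64)   # placeholder wide-field stack
confocal  = torch.rand(1, 1, 16, 64, 64)   # placeholder paired confocal stack

# One training step: update D on real/fake stacks, then G on adversarial + L1 loss.
fake = G(widefield)
d_loss = (bce(D(confocal), torch.ones_like(D(confocal)))
          + bce(D(fake.detach()), torch.zeros_like(D(fake.detach()))))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

g_loss = bce(D(fake), torch.ones_like(D(fake))) + 100.0 * l1(fake, confocal)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```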

     
  5.
    Robotic manipulation of deformable 1D objects such as ropes, cables, and hoses is challenging due to the lack of high-fidelity analytic models and large configuration spaces. Furthermore, learning end-to-end manipulation policies directly from images and physical interaction requires significant time on a robot and can fail to generalize across tasks. We address these challenges using interpretable deep visual representations for rope, extending recent work on dense object descriptors for robot manipulation. This facilitates the design of interpretable and transferable geometric policies built on top of the learned representations, decoupling visual reasoning and control. We present an approach that learns point-pair correspondences between initial and goal rope configurations, which implicitly encodes geometric structure, entirely in simulation from synthetic depth images. We demonstrate that the learned representation - dense depth object descriptors (DDODs) - can be used to manipulate a real rope into a variety of different arrangements either by learning from demonstrations or using interpretable geometric policies. In 50 trials of a knot-tying task with the ABB YuMi Robot, the system achieves a 66% knot-tying success rate from previously unseen configurations. See https://tinyurl.com/rope-learning for supplementary material and videos. 
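The correspondence step in the paragraph above can be illustrated with the minimal sketch below: each selected pixel in the current rope image is matched to the pixel in the goal image with the nearest descriptor. The descriptor maps here are random placeholders; in the paper they come from a network trained on synthetic depth images.

```python
# Minimal numpy sketch of dense-descriptor point-pair correspondence between
# current and goal rope images. Descriptor maps are placeholders (assumed);
# the learned DDOD network is not reproduced here.
import numpy as np

rng = np.random.default_rng(2)
H, W, D_DESC = 60, 80, 16

desc_current = rng.normal(size=(H, W, D_DESC))   # placeholder descriptor maps
desc_goal = rng.normal(size=(H, W, D_DESC))

def match_pixel(uv, desc_a, desc_b):
    """Return the (row, col) in image b whose descriptor is closest to pixel uv of image a."""
    d = desc_a[uv]                                 # (D_DESC,)
    dist = np.linalg.norm(desc_b - d, axis=-1)     # (H, W) distance field
    return np.unravel_index(np.argmin(dist), dist.shape)

# Match a few rope keypoints from the current configuration to the goal;
# the resulting pixel pairs can define pick-and-place actions for a geometric policy.
keypoints = [(10, 20), (30, 40), (50, 60)]
correspondences = [(kp, match_pixel(kp, desc_current, desc_goal)) for kp in keypoints]
print(correspondences)
```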