Title: Self-supervised clustering of mass spectrometry imaging data using contrastive learning
Mass spectrometry imaging (MSI) is widely used for the label-free molecular mapping of biological samples. The identification of co-localized molecules in MSI data is crucial to the understanding of biochemical pathways. One of the key challenges in molecular colocalization is that complex MSI data are too large for manual annotation but too small for training deep neural networks. Herein, we introduce a self-supervised clustering approach based on contrastive learning, which shows excellent performance in clustering MSI data. We train a deep convolutional neural network (CNN) using MSI data from a single experiment without manual annotations to effectively learn high-level spatial features from ion images and classify them based on molecular colocalizations. We demonstrate that contrastive learning generates ion image representations that form well-resolved clusters. Subsequent self-labeling is used to fine-tune both the CNN encoder and linear classifier based on confidently classified ion images. This new approach enables autonomous and high-throughput identification of co-localized species in MSI data, which will dramatically expand the application of spatial lipidomics, metabolomics, and proteomics in biological research.
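The contrastive learning step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state which loss function is used, so the example below assumes the normalized temperature-scaled cross-entropy (NT-Xent) loss popularized by SimCLR. The function name `nt_xent_loss`, the `temperature` value, and the use of plain NumPy arrays in place of CNN embeddings of augmented ion images are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for two batches of embeddings (hypothetical helper).

    z_a[i] and z_b[i] are assumed to be embeddings of two augmented views
    of the same ion image; every other row in the batch is a negative.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)              # shape (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit-length rows
    sim = (z @ z.T) / temperature                       # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                      # exclude self-pairs

    # Row i's positive is its other view: i + N for the first half, i - N for the second.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

Minimizing a loss of this kind pulls the two views of each image together and pushes different images apart, which is one way representations could come to form the well-resolved clusters the abstract describes.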
Award ID(s):
2108729
PAR ID:
10323724
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Chemical Science
Volume:
13
Issue:
1
ISSN:
2041-6520
Page Range / eLocation ID:
90 to 98
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Simultaneous spatial localization and structural characterization of molecules in complex biological samples currently represents an analytical challenge for mass spectrometry imaging (MSI) techniques. In this study, we describe a novel experimental platform, which substantially expands the capabilities and enhances the depth of chemical information obtained in high spatial resolution MSI experiments performed using nanospray desorption electrospray ionization (nano-DESI). Specifically, we designed and constructed a portable nano-DESI MSI platform and coupled it with a drift tube ion mobility spectrometer-mass spectrometer (IM-MS). Separation of biomolecules observed in MSI experiments based on their drift times provides unique molecular descriptors necessary for their identification by comparison with databases. Furthermore, it enables isomer-specific imaging, which is particularly important for unraveling the complexity of biological systems. Imaging of day 4 pregnant mouse uterine sections using the newly developed nano-DESI-IM-MSI system demonstrates rapid isobaric and isomeric separation and reduced chemical noise in MSI experiments. A direct comparison of the performance of the new nano-DESI-MSI platform operated in the MS mode with the more established nano-DESI-Orbitrap platform indicates a comparable performance of these two systems. A spatial resolution of better than ~16 μm and similar molecular coverage were obtained using both platforms. The structural information provided by the ion mobility separation expands the molecular specificity of high-resolution MSI necessary for the detailed understanding of biological systems.
  2. Messinger, David W.; Velez-Reyes, Miguel (Ed.)
    Recently, multispectral and hyperspectral data fusion models based on deep learning have been proposed to generate images with high spatial and spectral resolution. The general objective is to obtain images that improve spatial resolution while preserving high spectral content. In this work, two deep learning data fusion techniques are characterized in terms of classification accuracy. These methods fuse a high spatial resolution multispectral image with a lower spatial resolution hyperspectral image to generate a high spatial-spectral hyperspectral image. The first model is based on a multi-scale long short-term memory (LSTM) network. The LSTM approach performs the fusion using a multi-step process that transitions from low to high spatial resolution using an intermediate step capable of reducing spatial information loss while preserving spectral content. The second fusion model is based on a convolutional neural network (CNN) data fusion approach. We present fused images using four multi-source datasets with different spatial and spectral resolutions. Both models provide fused images with spatial resolution increased from 8 m to 1 m. The fused images obtained using the two models are evaluated in terms of classification accuracy on several classifiers: Minimum Distance, Support Vector Machines, Class-Dependent Sparse Representation and CNN classification. The classification results show better performance in both overall and average accuracy for the images generated with the multi-scale LSTM fusion than for those generated with the CNN fusion.
  3. Motivation: Mass spectrometry imaging (MSI) characterizes the molecular composition of tissues at spatial resolution, and has a strong potential for distinguishing tissue types, or disease states. This can be achieved by supervised classification, which takes as input MSI spectra, and assigns class labels to subtissue locations. Unfortunately, developing such classifiers is hindered by the limited availability of training sets with subtissue labels as the ground truth. Subtissue labeling is prohibitively expensive, and only rough annotations of the entire tissues are typically available. Classifiers trained on data with approximate labels have sub-optimal performance. Results: To alleviate this challenge, we contribute a semi-supervised approach mi-CNN. mi-CNN implements multiple instance learning with a convolutional neural network (CNN). The multiple instance aspect enables weak supervision from tissue-level annotations when classifying subtissue locations. The convolutional architecture of the CNN captures contextual dependencies between the spectral features. Evaluations on simulated and experimental datasets demonstrated that mi-CNN improved the subtissue classification as compared to traditional classifiers. We propose mi-CNN as an important step towards accurate subtissue classification in MSI, enabling rapid distinction between tissue types and disease states. 
  4. Bacteria identification can be a time-consuming process. Machine learning algorithms that use deep convolutional neural networks (CNNs) provide a promising alternative. Here, we present a deep learning based approach paired with Raman spectroscopy to rapidly and accurately detect the identity of a bacteria class. We propose a simple 4-layer CNN architecture and use a 30-class bacteria isolate dataset for training and testing. We achieve an identification accuracy of around 86% with identification speeds close to real-time. This optical/biological detection method is promising for applications in the detection of microbes in liquid biopsies and concentrated environmental liquid samples, where fast and accurate detection is crucial. This study uses a recently published dataset of Raman spectra from bacteria samples and an improved CNN model built with TensorFlow. Results show improved identification accuracy and reduced network complexity.
  5. Recent deep clustering algorithms take advantage of self-supervised learning and self-training techniques to map the original data into a latent space, where the data embedding and clustering assignment can be jointly optimized. However, as many recent datasets are enormous and noisy, getting a clear boundary between different clusters is challenging with existing methods, which mainly focus on contracting similar samples together and overlook samples near cluster boundaries in the latent space. In this regard, we propose an end-to-end deep clustering algorithm, i.e., Locally Normalized Soft Contrastive Clustering (LNSCC). It takes advantage of similarities among each sample’s local neighborhood and globally disconnected samples to leverage positiveness and negativeness of sample pairs in a contrastive way to separate different clusters. Experimental results on various datasets illustrate that our proposed approach achieves outstanding clustering performance over most of the state-of-the-art clustering methods for both image and non-image data even without convolution.