Title: Cell morphology-based machine learning models for human cell state classification
Herein, we implement and assess machine learning architectures to identify models that differentiate healthy from apoptotic cells using exclusively forward (FSC) and side (SSC) scatter flow cytometry information. To generate training data, colorectal cancer HCT116 cells were subjected to miR-34a treatment and then classified using a conventional Annexin V/propidium iodide (PI)-staining assay. The apoptotic cells were defined as Annexin V-positive cells, which include early and late apoptotic cells, necrotic cells, as well as other dying or dead cells. In addition to the fluorescent signal, we collected cell size and granularity information from the FSC and SSC parameters. Both parameters are subdivided into area, height, and width, thus providing a total of six numerical features that informed and trained our models. Logistic regression, random forest, k-nearest neighbor, multilayer perceptron, and support vector machine classifiers were trained and tested for classification performance in predicting cell states using only the six aforementioned numerical features. Out of 1046 candidate models, a multilayer perceptron was chosen with 0.91 live precision, 0.93 live recall, 0.92 live F-score, and 0.97 live area under the ROC curve when applied to standardized data. We discuss and highlight differences in classifier performance and compare the results to the standard practice of forward and side scatter gating, typically performed to select cells based on size and/or complexity. We demonstrate that our model, a ready-to-use module for any flow cytometry-based analysis, can provide automated, reliable, and stain-free classification of healthy and apoptotic cells using exclusively size and granularity information.
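The reported live-class metrics can be cross-checked with a short sketch. It assumes that the "f value" in the abstract is the F1 score (the harmonic mean of precision and recall) and that "standardized data" means per-feature z-scoring; neither is confirmed by the abstract. The function names and the toy labels are hypothetical and serve only as illustration.

```python
from statistics import mean, pstdev


def standardize(column):
    """Z-score one feature column (assumed meaning of 'standardized data')."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (assumed meaning of 'f value')."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(y_true, y_pred, positive="live"):
    """Precision, recall, and F1 for one class, counted from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


# Consistency check on the figures quoted in the abstract.
reported_f1 = round(f1_from(0.91, 0.93), 2)

# Toy labels (hypothetical, not data from the study).
toy_true = ["live", "live", "live", "apoptotic"]
toy_pred = ["live", "live", "apoptotic", "live"]
toy_metrics = precision_recall_f1(toy_true, toy_pred)
```

Under the F1 assumption, a precision of 0.91 and a recall of 0.93 round to the reported 0.92, so the three quoted values are mutually consistent.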
Award ID(s):
2029121
NSF-PAR ID:
10233562
Journal Name:
npj systems biology and applications
Volume:
7
Issue:
23
ISSN:
2056-7189
Sponsoring Org:
National Science Foundation
More Like this
  2.
    Detection and quantification of bacterial endotoxins is important in a range of health-related contexts, including during pharmaceutical manufacturing of therapeutic proteins and vaccines. Here we combine experimental measurements based on nematic liquid crystalline droplets and machine learning methods to show that it is possible to classify bacterial sources (Escherichia coli, Pseudomonas aeruginosa, Salmonella minnesota) and quantify the concentration of endotoxin derived from all three bacterial species present in aqueous solution. The approach uses flow cytometry to quantify, in a high-throughput manner, changes in the internal ordering of micrometer-sized droplets of nematic 4-cyano-4′-pentylbiphenyl triggered by the endotoxins. The changes in internal ordering alter the intensities of light side-scattered (SSC, large-angle) and forward-scattered (FSC, small-angle) by the liquid crystal droplets. A convolutional neural network (EndoNet) is trained using the large data sets generated by flow cytometry and shown to predict endotoxin source and concentration directly from the FSC/SSC scatter plots. By using saliency maps, we reveal how EndoNet captures subtle differences in scatter fields to enable classification of bacterial source and quantification of endotoxin concentration over a range that spans eight orders of magnitude (0.01 pg mL⁻¹ to 1 μg mL⁻¹). We attribute the changes in scatter fields with bacterial origin of the endotoxin, as detected by EndoNet, to the distinct molecular structures of the lipid A domains of the endotoxins derived from the three bacteria. Overall, we conclude that the combination of liquid crystal droplets and EndoNet provides the basis of a promising analytical approach for endotoxins that does not require use of complex biologically-derived reagents (e.g., Limulus amoebocyte lysate).
  3. Abstract

    Most cancer patients die from metastatic disease as a result of circulating tumor cells (CTCs) spreading from a primary tumor through the blood circulation to distant organs. Many studies have demonstrated the tremendous potential of using CTC counts as prognostic markers of metastatic development and therapeutic efficacy. However, only the viable CTCs capable of surviving in the blood circulation can create distant metastases. To date, little progress has been made in understanding what proportion of CTCs is viable and what proportion is in an apoptotic state. Here, we introduce a novel approach toward in situ characterization of CTC apoptosis status using a multicolor in vivo flow cytometry platform with fluorescent detection for the real-time identification and enumeration of such cells directly in blood flow. The proof of concept was demonstrated with two-color fluorescence flow cytometry (FFC) using MDA-MB-231 breast cancer cells expressing green fluorescent protein (GFP), staurosporine as an activator of apoptosis, an Annexin-V apoptosis kit with an orange dye, and a mouse model. The future application of this new platform for real-time monitoring of antitumor drug efficiency is discussed. © 2019 International Society for Advancement of Cytometry
  4. ABSTRACT

    EseN is an Edwardsiella ictaluri type III secretion system effector with phosphothreonine lyase activity. In this work, we demonstrate that EseN inactivates p38 and c-Jun-N-terminal kinase (JNK) in infected head-kidney-derived macrophages (HKDMs). We have previously reported inactivation of extracellular-regulated kinase 1/2 (ERK1/2). Also, for the first time, we demonstrated that EseN is involved in the inactivation of 3-phosphoinositide-dependent kinase 1 (PDK1), which has not been previously demonstrated for any of the EseN homologs in other species. We also found that EseN significantly affected mRNA expression of IL-10, pro-apoptotic baxa, and p53, but had no significant effect on anti-apoptotic bcl2 or pro-apoptotic apoptotic peptidase activating factor 1. EseN is also involved in the inhibition of caspase-8 and caspase-3/7 but does not affect caspase-9 activity. Repression of apoptosis was further confirmed with flow cytometry using Alexa Fluor 647-labeled annexin V and propidium iodide. In addition, we found that the E. ictaluri T3SS is essential for the inhibition of IL-1β maturation, but EseN is not involved in this process. EseN did not affect cell pyroptosis, as indicated by the lack of EseN impact on the release of lactate dehydrogenase from infected HKDM. The transmission electron microscopy data also indicate that HKDM infected with WT or an eseN mutant died by apoptosis, while HKDM infected with the T3SS mutant more likely died by pyroptosis. Collectively, our results indicate that E. ictaluri EseN is involved in inactivation of ERK1/2, p38, JNK, and PDK1 signaling pathways that lead to modulation of cell death among infected HKDMs.

    IMPORTANCE

    This work has global significance in the catfish industry, which provides food for increasing global populations. E. ictaluri is a leading cause of disease loss, and EseN is an important player in E. ictaluri virulence. The E. ictaluri T3SS effector EseN plays an essential role in establishing infection, but the specific role EseN plays is not well characterized. EseN belongs to a family of phosphothreonine lyase effectors that specifically target host mitogen activated protein kinase (MAPK) pathways important in regulating host responses to infection. No phosphothreonine lyase equivalents are known in eukaryotes, making this family of effectors an attractive target for indirect narrow-spectrum antibiotics. Targeting of major vault protein and PDK1 kinase by EseN has not been reported in EseN homologs in other pathogens and may indicate unique functions of E. ictaluri EseN. EseN targeting of PDK1 is particularly interesting in that it is linked to an extraordinarily diverse group of cellular functions.
  5. Porcine reproductive and respiratory syndrome (PRRS) is an infectious disease of pigs caused by PRRS virus (PRRSV). A modified live-attenuated vaccine has been widely used to control the spread of PRRSV, and the classification of field strains is key to successful control and prevention. Restriction fragment length polymorphism targeting the open reading frame 5 (ORF5) gene is widely used to classify PRRSV strains but has shown unstable accuracy. Phylogenetic analysis is a powerful tool for PRRSV classification with consistent accuracy, but it demands substantial computational power as the number of sequences increases. Our study aimed to apply four machine learning (ML) algorithms (random forest, k-nearest neighbor, support vector machine, and multilayer perceptron) to classify field PRRSV strains into four clades using amino acid scores based on the ORF5 gene sequence. Our study used amino acid sequences of the ORF5 gene in 1931 field PRRSV strains collected in the US from 2012 to 2020. Phylogenetic analysis was used to label field PRRSV strains as one of four clades: Lineage 5 or three clades in Lineage 1. We measured the accuracy and time consumption of classification using the four ML approaches across gene sequence datasets of different sizes. We found that all four ML algorithms classify a large number of field strains in a very short time (<2.5 s) with very high accuracy (area under the receiver operating characteristic curve >0.99). Furthermore, the random forest approach detects a total of 4 key amino acid positions for the classification of field PRRSV strains into four clades. Our findings provide insight for developing a rapid and accurate classification model using genetic information, which also enables us to handle large genome datasets in real time or semi-real time for data-driven decision-making and more timely surveillance.
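The accuracy figure above is an area under the receiver operating characteristic curve (AUC). As a minimal sketch of what that number measures, the AUC of a binary scorer equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the toy scores are hypothetical. How the study computed AUC for its four-clade problem is not stated in the abstract; a one-vs-rest average over clades is one common option and is only an assumption here.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: classifier scores, higher meaning 'more likely positive'.
    labels: 1 for the positive class, 0 for the negative class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy scores (hypothetical, not data from the study).
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

This pairwise form is quadratic in the number of examples, which is fine for illustration; rank-based formulations give the same value more efficiently on large datasets.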