Title: Phenotyping Immune Cells in Tumor and Healthy Tissue Using Flow Cytometry Data
We present an automated pipeline capable of distinguishing the phenotypes of myeloid-derived suppressor cells (MDSC) in healthy and tumor-bearing tissues in mice using flow cytometry data. In contrast to earlier work where samples are analyzed individually, we analyze all samples from each tissue collectively using a representative template for it. We demonstrate with 43 flow cytometry samples collected from three tissues (naive bone marrow, spleens of tumor-bearing mice, and intra-peritoneal tumor) that a set of templates serves as a better classifier than popular machine learning approaches including support vector machines and neural networks. Our "interpretable machine learning" approach goes beyond classification and identifies distinctive phenotypes associated with each tissue, information that is clinically useful. Hence the pipeline presented here leads to better understanding of the maturation and differentiation of MDSCs using high-throughput data.
Journal Name:
BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Mononuclear phagocytes comprise an array of tissue‐resident and monocyte‐derived cells with important roles in tissue homeostasis and resistance to infection. Their diverse phenotypes make functional characterization within tissues challenging, because multiple surface markers are typically required for subset identification and isolation by cell sorting methods. Analysis of SLAMF9 expression within renal mononuclear phagocyte populations by multi‐parametric flow cytometry indicates that SLAMF9 is a specific marker for identification of kidney‐resident CD45+CD11c+MHC‐II+ cells corresponding to prominent tissue‐resident MPC populations derived from dendritic cell progenitors in adult mice. High SLAMF9 expression was sufficient to identify and sort these cells from disaggregated tissue using a user‐operated cell sorter. The population can be further subdivided according to expression of CD11b and CD14 to identify IRF8high cDC1 cells and cleanly separate the CD11bhigh F4/80low and CD11bint F4/80high CD11c+ MPC subsets. Therefore, SLAMF9 expression allows for the identification and sorting of kidney‐resident CD11b+CD11c+CD64+F4/80+CX3CR1+MHC‐II+ MPCs without the need for complex antibody panels or reporter mice, simplifying isolation of these cells for study ex vivo.

  2. Tumor stiffness has been associated with malignancy and increased risk for metastasis. Extensive research has been done investigating breast cancer cell lines’ responsiveness to surfaces of varying rigidities as well as examining the biophysical properties of breast cancer tumor samples. However, there is a critical gap regarding the relationship between cells’ mechanosensitivity and the biophysical properties of their extracellular matrix environment. To explore this relationship, we will analyze single-cell mechanosensitivity in comparison to tumor rigidity via shear-wave ultrasound elastography (SWE). Given the putative association, we hypothesize that cells expressing invasive mechanosensitivity profiles will correlate with stiffer tumor regions. Using collagen gels containing different cell types, we derived biopsy-sized samples allowing us to optimize single-cell mechanosensitivity analysis. Cells were stained using different dyes corresponding to invasiveness. Subsequently, we analyzed their morphology. Morphological identification within organoid environments would allow for single-cell analysis without the aggression of tissue digestion, though preliminary results suggest high heterogeneity may not allow for confident cell identification solely on morphology. Thus, cell viability and integrity were examined by analyzing the effects of tissue digestion with HyQtase on single cells. Cell count and live-dead stain via flow cytometry allowed for analysis of single-cell viability. Lastly, cell integrity was evaluated by a 2D adhesion assay of isolated cells. The live/dead stain revealed that digestion resulted in isolation of approximately 10% of the original 500,000 cell population with 90–97% of the isolated population being live cells (invasive and non-invasive, respectively). 
Furthermore, the adhesion assay showed that these isolated single cells retained the ability to adhere to new surfaces, with no difference between the invasive and non-invasive cell types. These results show that cells are able to retain mechanosensitive properties following enzymatic digestion. However, they also suggest our digestion procedure is not aggressive enough to isolate invasive subpopulations that are more strongly embedded in the original tissues. Development of these novel techniques will allow for accurate and confident analysis of precious human biopsy samples. Insight into the relationship between single-cell mechanosensitivity and tumor biophysical properties could elucidate pathways for metastasis inhibition and prevention. 
  3. INTRODUCTION Diverse phenotypes, including large brains relative to body size, group living, and vocal learning ability, have evolved multiple times throughout mammalian history. These shared phenotypes may have arisen repeatedly by means of common mechanisms discernible through genome comparisons. RATIONALE Protein-coding sequence differences have failed to fully explain the evolution of multiple mammalian phenotypes. This suggests that these phenotypes have evolved at least in part through changes in gene expression, meaning that their differences across species may be caused by differences in genome sequence at enhancer regions that control gene expression in specific tissues and cell types. Yet the enhancers involved in phenotype evolution are largely unknown. Sequence conservation–based approaches for identifying such enhancers are limited because enhancer activity can be conserved even when the individual nucleotides within the sequence are poorly conserved. This is due to an overwhelming number of cases where nucleotides turn over at a high rate, but a similar combination of transcription factor binding sites and other sequence features can be maintained across millions of years of evolution, allowing the function of the enhancer to be conserved in a particular cell type or tissue. Experimentally measuring the function of orthologous enhancers across dozens of species is currently infeasible, but new machine learning methods make it possible to make reliable sequence-based predictions of enhancer function across species in specific tissues and cell types. RESULTS To overcome the limits of studying individual nucleotides, we developed the Tissue-Aware Conservation Inference Toolkit (TACIT). Rather than measuring the extent to which individual nucleotides are conserved across a region, TACIT uses machine learning to test whether the function of a given part of the genome is likely to be conserved. 
More specifically, convolutional neural networks learn the tissue- or cell type–specific regulatory code connecting genome sequence to enhancer activity using candidate enhancers identified from only a few species. This approach allows us to accurately associate differences between species in tissue or cell type–specific enhancer activity with genome sequence differences at enhancer orthologs. We then connect these predictions of enhancer function to phenotypes across hundreds of mammals in a way that accounts for species’ phylogenetic relatedness. We applied TACIT to identify candidate enhancers from motor cortex and parvalbumin neuron open chromatin data that are associated with brain size relative to body size, solitary living, and vocal learning across 222 mammals. Our results include the identification of multiple candidate enhancers associated with brain size relative to body size, several of which are located in linear or three-dimensional proximity to genes whose protein-coding mutations have been implicated in microcephaly or macrocephaly in humans. We also identified candidate enhancers associated with the evolution of solitary living near a gene implicated in separation anxiety and other enhancers associated with the evolution of vocal learning ability. We obtained distinct results for bulk motor cortex and parvalbumin neurons, demonstrating the value in applying TACIT to both bulk tissue and specific minority cell type populations. To facilitate future analyses of our results and applications of TACIT, we released predicted enhancer activity of >400,000 candidate enhancers in each of 222 mammals and their associations with the phenotypes we investigated. CONCLUSION TACIT leverages predicted enhancer activity conservation rather than nucleotide-level conservation to connect genetic sequence differences between species to phenotypes across large numbers of mammals. 
TACIT can be applied to any phenotype with enhancer activity data available from at least a few species in a relevant tissue or cell type and a whole-genome alignment available across dozens of species with substantial phenotypic variation. Although we developed TACIT for transcriptional enhancers, it could also be applied to genomic regions involved in other components of gene regulation, such as promoters and splicing enhancers and silencers. As the number of sequenced genomes grows, machine learning approaches such as TACIT have the potential to help make sense of how conservation of, or changes in, subtle genome patterns can help explain phenotype evolution. Tissue-Aware Conservation Inference Toolkit (TACIT) associates genetic differences between species with phenotypes. TACIT works by generating open chromatin data from a few species in a tissue related to a phenotype, using the sequences underlying open and closed chromatin regions to train a machine learning model for predicting tissue-specific open chromatin and associating open chromatin predictions across dozens of mammals with the phenotype. [Species silhouettes are from PhyloPic] 
  4. Abstract

    High‐throughput single‐cell cytometry technologies have significantly improved our understanding of cellular phenotypes to support translational research and the clinical diagnosis of hematological and immunological diseases. However, subjective and ad hoc manual gating analysis does not adequately handle the increasing volume and heterogeneity of cytometry data for optimal diagnosis. Prior work has shown that machine learning can be applied to classify cytometry samples effectively. However, many of the machine learning classification results are either difficult to interpret without using characteristics of cell populations to make the classification, or suboptimal due to the use of inaccurate cell population characteristics derived from gating boundaries. To date, little has been done to optimize both the gating boundaries and the diagnostic accuracy simultaneously. In this work, we describe a fully discriminative machine learning approach that can simultaneously learn feature representations (e.g., combinations of coordinates of gating boundaries) and classifier parameters for optimizing clinical diagnosis from cytometry measurements. The approach starts from an initial gating position and then refines the position of the gating boundaries by gradient descent until a set of globally‐optimized gates across different samples is achieved. The learning procedure is constrained by regularization terms encoding domain knowledge that encourage the algorithm to seek interpretable results. We evaluate the proposed approach using both simulated and real data, producing classification results on par with those generated via human expertise, in terms of both the positions of the gating boundaries and the diagnostic accuracy. © 2019 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.

  5. Optical coherence tomography (OCT) leverages light scattering by biological tissues as endogenous contrast to form structural images. Light scattering behavior is dictated by the optical properties of the tissue, which depend on microstructural details at the cellular or sub-cellular level. Methods to measure these properties from OCT intensity data have been explored in the context of a number of biomedical applications seeking to access this sub-resolution tissue microstructure and thereby increase the diagnostic impact of OCT. Most commonly, the optical attenuation coefficient, an analogue of the scattering coefficient, has been used as a surrogate metric linking OCT intensity to subcellular particle characteristics. To record attenuation coefficient data that is accurately representative of the underlying physical properties of a given sample, it is necessary to account for the impact of the OCT imaging system itself on the distribution of light intensity in the sample, including the numerical aperture (NA) of the system and the location of the focal plane with respect to the sample surface, as well as the potential contribution of multiple scattering to the reconstructed intensity signal. Although these considerations complicate attenuation coefficient measurement and interpretation, a suitably calibrated system may potentiate a powerful strategy for gaining additional information about the scattering behavior and microstructure of samples. In this work, we experimentally show that altering the OCT system geometry minimally impacts measured attenuation coefficients in samples presumed to be singly scattering, but changes these measurements in more highly scattering samples. Using both depth-resolved attenuation coefficient data and layer-resolved backscattering coefficients, we demonstrate the retrieval of scattering particle diameter and concentration in tissue-mimicking phantoms, and the impact of presumed multiple scattering on these calculations. 
We further extend our approach to characterize a murine brain tissue sample and highlight a tumor-bearing region based on increased scattering particle density. Through these methods, we not only enhance conventional OCT attenuation coefficient analysis by decoupling the independent effects of particle size and concentration, but also discriminate areas of strong multiple scattering through minor changes to system topology to provide a framework for assessing the accuracy of these measurements.
