Title: Morphological profiling for drug discovery in the era of deep learning
Abstract Morphological profiling is a valuable tool in phenotypic drug discovery. The advent of high-throughput automated imaging has enabled the capture of a wide range of morphological features of cells or organisms in response to perturbations at single-cell resolution. Concurrently, significant advances in machine learning and deep learning, especially in computer vision, have led to substantial improvements in analyzing large-scale high-content images at high throughput. These efforts have facilitated understanding of compound mechanism of action, drug repurposing, and characterization of cell morphodynamics under perturbation, and ultimately contribute to the development of novel therapeutics. In this review, we provide a comprehensive overview of the recent advances in the field of morphological profiling. We summarize the image profiling analysis workflow, survey a broad spectrum of analysis strategies encompassing feature engineering– and deep learning–based approaches, and introduce publicly available benchmark datasets. We place a particular emphasis on the application of deep learning in this pipeline, covering cell segmentation, image representation learning, and multimodal learning. Additionally, we illuminate the application of morphological profiling in phenotypic drug discovery and highlight potential challenges and opportunities in this field.
Award ID(s):
2111679
PAR ID:
10546368
Author(s) / Creator(s):
Publisher / Repository:
Oxford Academic
Date Published:
Journal Name:
Briefings in Bioinformatics
Volume:
25
Issue:
4
ISSN:
1467-5463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Measuring the phenotypic effect of treatments on cells through imaging assays is an efficient and powerful way of studying cell biology, and requires computational methods for transforming images into quantitative data. Here, we present an improved strategy for learning representations of treatment effects from high-throughput imaging, following a causal interpretation. We use weakly supervised learning for modeling associations between images and treatments, and show that it encodes both confounding factors and phenotypic features in the learned representation. To facilitate their separation, we constructed a large training dataset with images from five different studies to maximize experimental diversity, following insights from our causal analysis. Training a model with this dataset successfully improves downstream performance, and produces a reusable convolutional network for image-based profiling, which we call Cell Painting CNN. We evaluated our strategy on three publicly available Cell Painting datasets, and observed that the Cell Painting CNN improves performance in downstream analysis by up to 30% with respect to classical features, while also being more computationally efficient.
  2. Abstract Predicting assay results for compounds virtually using chemical structures and phenotypic profiles has the potential to reduce the time and resources of screens for drug discovery. Here, we evaluate the relative strength of three high-throughput data sources—chemical structures, imaging (Cell Painting), and gene-expression profiles (L1000)—to predict compound bioactivity using a historical collection of 16,170 compounds tested in 270 assays for a total of 585,439 readouts. All three data modalities can predict compound activity for 6–10% of assays, and in combination they predict 21% of assays with high accuracy, which is a 2 to 3 times higher success rate than using a single modality alone. In practice, the accuracy of predictors could be lower and still be useful, increasing the assays that can be predicted from 37% with chemical structures alone up to 64% when combined with phenotypic data. Our study shows that unbiased phenotypic profiling can be leveraged to enhance compound bioactivity prediction to accelerate the early stages of the drug-discovery process. 
  3. Three-dimensional (3D) tumor spheroid models have gained increased recognition as important tools in cancer research and anti-cancer drug development. However, currently available imaging approaches employed in high-throughput screening drug discovery platforms, e.g., bright field, phase contrast, and fluorescence microscopies, are unable to resolve 3D structures deep inside (>50 μm) tumor spheroids. In this study, we established a label-free, non-invasive optical coherence tomography (OCT) imaging platform to characterize 3D morphological and physiological information of multicellular tumor spheroids (MCTS) growing from ~250 μm up to ~600 μm in height over 21 days. In particular, tumor spheroids of two cell lines, glioblastoma (U-87 MG) and colorectal carcinoma (HCT 116), exhibited distinctive evolutions in their geometric shapes at late growth stages. Volumes of MCTS were accurately quantified using a voxel-based approach without presumptions of their geometries. In contrast, conventional diameter-based volume calculations assuming perfect spherical shape resulted in large quantification errors. Furthermore, we successfully detected necrotic regions within these tumor spheroids based on increased intrinsic optical attenuation, suggesting a promising label-free alternative to viability tests in tumor spheroids. Therefore, OCT can serve as a promising imaging modality to characterize morphological and physiological features of MCTS, showing great potential for high-throughput drug screening.
  4. The advancements of information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, has increased the reliability of techniques that incorporate machine learning and deep learning. The use of virtual screening and comprehensive online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications and methods that produce promising results will be reviewed.
  5. Phenomics requires quantification of large volumes of image data, necessitating high throughput image processing approaches. Existing image processing pipelines for Drosophila wings, a powerful genetic model for studying the underlying genetics for a broad range of cellular and developmental processes, are limited in speed, precision, and functional versatility. To expand on the utility of the wing as a phenotypic screening system, we developed MAPPER, an automated machine learning-based pipeline that quantifies high-dimensional phenotypic signatures, with each dimension quantifying a unique morphological feature of the Drosophila wing. MAPPER magnifies the power of Drosophila phenomics by rapidly quantifying subtle phenotypic differences in sample populations. We benchmarked MAPPER’s accuracy and precision in replicating manual measurements to demonstrate its widespread utility. The morphological features extracted using MAPPER reveal variable sexual dimorphism across Drosophila species and unique underlying sex-specific differences in morphogen signaling in male and female wings. Moreover, the length of the proximal-distal axis across the species and sexes shows a conserved scaling relationship with respect to the wing size. In sum, MAPPER is an open-source tool for rapid, high-dimensional analysis of large imaging datasets. These high-content phenomic capabilities enable rigorous and systematic identification of genotype-to-phenotype relationships in a broad range of screening and drug testing applications and amplify the potential power of multimodal genomic approaches. 
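
The contrast drawn in item 3 between voxel-based and diameter-based volume estimates can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study: the synthetic flattened ellipsoid standing in for a segmented spheroid, the 5 μm isotropic voxel size, and the helper names `voxel_volume`, `sphere_volume_from_diameter`, and `ellipsoid_mask`.

```python
import numpy as np

def voxel_volume(mask, voxel_size_um):
    """Volume in cubic micrometres: foreground voxel count times the volume of one voxel."""
    dz, dy, dx = voxel_size_um
    return float(np.count_nonzero(mask)) * dz * dy * dx

def sphere_volume_from_diameter(diameter_um):
    """Volume of a perfect sphere with the given diameter (pi * d^3 / 6)."""
    return np.pi * diameter_um ** 3 / 6.0

def ellipsoid_mask(shape, semi_axes_vox):
    """Synthetic binary mask of an axis-aligned ellipsoid centred in the array."""
    zz, yy, xx = np.indices(shape)
    cz, cy, cx = [(n - 1) / 2.0 for n in shape]
    a, b, c = semi_axes_vox
    return ((zz - cz) / a) ** 2 + ((yy - cy) / b) ** 2 + ((xx - cx) / c) ** 2 <= 1.0

# Hypothetical flattened spheroid: semi-axes of 20 voxels in z and 40 voxels in y and x.
voxel_size_um = (5.0, 5.0, 5.0)  # assumed isotropic voxel size
mask = ellipsoid_mask((64, 96, 96), (20, 40, 40))

v_voxel = voxel_volume(mask, voxel_size_um)
# Diameter-based estimate using only the lateral diameter (2 * 40 voxels * 5 um = 400 um).
v_sphere = sphere_volume_from_diameter(2 * 40 * 5.0)
# Analytic ellipsoid volume, for reference: 4/3 * pi * a * b * c in micrometres.
v_true = 4.0 / 3.0 * np.pi * (20 * 5.0) * (40 * 5.0) * (40 * 5.0)

print(f"voxel-based:    {v_voxel:.3e} um^3")
print(f"diameter-based: {v_sphere:.3e} um^3")
print(f"analytic:       {v_true:.3e} um^3")
```

In this toy case the voxel count tracks the analytic volume closely, while the spherical assumption overestimates it by roughly a factor of two because the object is half as tall as it is wide. Real OCT volumes would also need segmentation and correct axial scaling, which this sketch does not cover.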