Title: Label free identification of different cancer cells using deep learning-based image analysis
Cancer diagnostics is an important field of cancer recovery and survival with many expensive procedures needed to administer the correct treatment. Machine Learning (ML) approaches can help with the diagnostic prediction from circulating tumor cells in liquid biopsy or from a primary tumor in solid biopsy. After predicting the metastatic potential from a deep learning model, doctors in a clinical setting can administer a safe and correct treatment for a specific patient. This paper investigates the use of deep convolutional neural networks for predicting a specific cancer cell line as a tool for label free identification. Specifically, deep learning strategies for weight initialization and performance metrics are described, with transfer learning and the accuracy metric utilized in this work. The equipment used for prediction involves brightfield microscopy without the use of chemical labels, advanced instruments, or time-consuming biological techniques, giving an advantage over current diagnostic methods. In the procedure, three different binary datasets of well-known cancer cell lines were collected, each having a difference in metastatic potential. Two different classification models were adopted (EfficientNetV2 and ResNet-50) with the analysis given for each stage in the ML architecture. The training results for each model and dataset are provided and systematically compared. We found that the test set accuracy showed favorable performance for both ML models with EfficientNetV2 accuracy reaching up to 99%. These test results allowed EfficientNetV2 to outperform ResNet-50 at an average percent increase of 3.5% for each dataset. The high accuracy obtained from the predictions demonstrates that the system can be retrained on a large-scale clinical dataset.
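The abstract compares two classifiers by test set accuracy and reports the gap as a percent increase. As a minimal sketch of how such a comparison could be computed, the Python below evaluates accuracy on one binary dataset and the increase of one model over another. The labels, predictions, and function names are hypothetical and are not taken from the paper. The abstract does not state whether "percent increase" is relative or measured in percentage points, so the sketch assumes the relative definition and prints both.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def percent_increase(baseline, improved):
    """Relative increase of `improved` over `baseline`, in percent (assumed definition)."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical test labels for one binary dataset (0 and 1 stand for two cell lines).
y_true = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]
# Hypothetical predictions from two models; these are not results from the paper.
pred_a = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1]
pred_b = [0, 1, 1, 1, 0, 1, 1, 0, 1, 1]

acc_a = accuracy(y_true, pred_a)
acc_b = accuracy(y_true, pred_b)

print(f"model A accuracy: {acc_a:.3f}")
print(f"model B accuracy: {acc_b:.3f}")
print(f"relative increase of A over B: {percent_increase(acc_b, acc_a):.1f}%")
print(f"difference in percentage points: {(acc_a - acc_b) * 100:.1f}")
```

The two definitions can differ noticeably when the baseline accuracy is well below 100%, so a reported percent increase is easier to interpret when the underlying accuracies are given alongside it.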
Award ID(s):
1935792
PAR ID:
10588270
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
American Institute of Physics
Date Published:
Journal Name:
APL Machine Learning
Volume:
1
Issue:
2
ISSN:
2770-9019
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Development of an assay to predict response to chemotherapy has remained an elusive goal in cancer research. We report a phenotypic chemosensitivity assay for epithelial ovarian cancer based on Doppler spectroscopy of infrared light scattered from intracellular motions in living three-dimensional tumor biopsy tissue measured in vitro. The study analyzed biospecimens from 20 human patients with epithelial ovarian cancer. Matched primary and metastatic tumor tissues were collected for 3 patients, and an additional 3 patients provided only metastatic tissues. Doppler fluctuation spectra were obtained using full-field optical coherence tomography through off-axis digital holography. Frequencies in the range from 10 mHz to 10 Hz are sensitive to changes in intracellular dynamics caused by platinum-based chemotherapy. Metastatic tumor tissues were found to display a biodynamic phenotype that was similar to primary tissue from patients who had poor clinical outcomes. The biodynamic phenotypic profile correctly classified 90% [88–91% c.i.] of the patients when the metastatic samples were characterized as having a chemoresistant phenotype. This work suggests that Doppler profiling of tissue response to chemotherapy has the potential to predict patient clinical outcomes based on primary, but not metastatic, tumor tissue.
  2. Automated region of interest detection in histopathological image analysis is a challenging and important topic with tremendous potential impact on clinical practice. The deep learning methods used in computational pathology may help us to reduce costs and increase the speed and accuracy of cancer diagnosis. We started with the UNC Melanocytic Tumor Dataset cohort which contains 160 hematoxylin and eosin whole slide images of primary melanoma (86) and nevi (74). We randomly assigned 80% (134) as a training set and built an in-house deep learning method to allow for classification, at the slide level, of nevi and melanoma. The proposed method performed well on the other 20% (26) test dataset; the accuracy of the slide classification task was 92.3% and our model also performed well in terms of predicting the region of interest annotated by the pathologists, showing excellent performance of our model on melanocytic skin tumors. Even though we tested the experiments on a skin tumor dataset, our work could also be extended to other medical image detection problems to benefit the clinical evaluation and diagnosis of different tumors. 
  3. Surgical pathology reports contain essential diagnostic information, in free-text form, required for cancer staging, treatment planning, and cancer registry documentation. However, their unstructured nature and variability across tumor types and institutions pose challenges for automated data extraction. We present a consensus-driven, reasoning-based framework that uses multiple locally deployed large language models (LLMs) to extract six key diagnostic variables: site, laterality, histology, stage, grade, and behavior. Each LLM produces structured outputs with accompanying justifications, which are evaluated for accuracy and coherence by a separate reasoning model. Final consensus values are determined through aggregation, and expert validation is conducted by board-certified or equivalent pathologists. The framework was applied to over 4,000 pathology reports from The Cancer Genome Atlas (TCGA) and Moffitt Cancer Center. Expert review confirmed high agreement in the TCGA dataset for behavior (100.0%), histology (98.5%), site (95.2%), and grade (95.6%), with lower performance for stage (87.6%) and laterality (84.8%). In the pathology reports from Moffitt (brain, breast, and lung), accuracy remained high across variables, with histology (95.6%), behavior (98.3%), and stage (92.4%), achieving strong agreement. However, certain challenges emerged, such as inconsistent mention of sentinel lymph node details or anatomical ambiguity in biopsy site interpretations. Statistical analyses revealed significant main effects of model type, variable, and organ system, as well as model × variable × organ interactions, emphasizing the role of clinical context in model performance. These results highlight the importance of stratified, multi-organ evaluation frameworks in LLM benchmarking for clinical applications. Textual justifications enhanced interpretability and enabled human reviewers to audit model outputs. 
Overall, this consensus-based approach demonstrates that locally deployed LLMs can provide a transparent, accurate, and auditable solution for integrating AI-driven data extraction into real-world pathology workflows, including cancer registry abstraction and synoptic reporting. 
  4. Breast cancer is the leading cancer affecting women globally. Despite deep learning models making significant strides in diagnosing and treating this disease, ensuring fair outcomes across diverse populations presents a challenge, particularly when certain demographic groups are underrepresented in training datasets. Addressing the fairness of AI models across varied demographic backgrounds is crucial. This study analyzes demographic representation within the publicly accessible Emory Breast Imaging Dataset (EMBED), which includes de-identified mammography and clinical data. We spotlight the data disparities among racial and ethnic groups and assess the biases in mammography image classification models trained on this dataset, specifically ResNet-50 and Swin Transformer V2. Our evaluation of classification accuracies across these groups reveals significant variations in model performance, highlighting concerns regarding the fairness of AI diagnostic tools. This paper emphasizes the imperative need for fairness in AI and suggests directions for future research aimed at increasing the inclusiveness and dependability of these technologies in healthcare settings. Code is available at: https://github.com/kuanhuang0624/EMBEDFairModels. 
  5. Traditional drug screening models are often unable to faithfully recapitulate human physiology in health and disease, motivating the development of microfluidic organs-on-a-chip (OOC) platforms that can mimic many aspects of human physiology and in the process alleviate many of the discrepancies between preclinical studies and clinical trials outcomes. Linsitinib, a novel anti-cancer drug, showed promising results in pre-clinical models of Ewing Sarcoma (ES), where it suppressed tumor growth. However, a Phase II clinical trial in several European centers with patients showed relapsed and/or refractory ES. We report an integrated, open setting, imaging and sampling accessible, polysulfone-based platform, featuring minimal hydrophobic compound binding. Two bioengineered human tissues – bone ES tumor and heart muscle – were cultured either in isolation or in the integrated platform and subjected to a clinically used linsitinib dosage. The measured anti-tumor efficacy and cardiotoxicity were compared with the results observed in the clinical trial. Only the engineered tumor tissues, and not monolayers, recapitulated the bone microenvironment pathways targeted by linsitinib, and the clinically-relevant differences in drug responses between non-metastatic and metastatic ES tumors. The responses of non-metastatic ES tumor tissues and heart muscle to linsitinib were much closer to those observed in the clinical trial for tissues cultured in an integrated setting than for tissues cultured in isolation. Drug treatment of isolated tissues resulted in significant decreases in tumor viability and cardiac function. Meanwhile, drug treatment in an integrated setting showed poor tumor response and less cardiotoxicity, which matched the results of the clinical trial. Overall, the integration of engineered human tumor and cardiac tissues in the integrated platform improved the predictive accuracy for both the direct and off-target effects of linsitinib. 
The proposed approach could be readily extended to other drugs and tissue systems. 