Abstract Purpose. This study aims to develop and validate a multi-view learning method that combines primary tumor radiomics and lymph node (LN) radiomics for the preoperative prediction of LN status in gastric cancer (GC). Methods. A total of 170 contrast-enhanced abdominal CT images from GC patients were enrolled in this retrospective study. After data preprocessing, a two-step feature selection approach, consisting of Pearson correlation analysis and a supervised feature selection method based on test-time budget (FSBudget), was applied to remove redundancy from the tumor and LN radiomics features separately. The two types of discriminative features were then used by an unsupervised multi-view partial least squares (UMvPLS) method to learn a latent common space, on which a logistic regression classifier was trained. Five repeated random hold-out experiments were performed. Results. On the 20-dimensional latent common space, the area under the receiver operating characteristic curve (AUC), precision, accuracy, recall, and F1-score were 0.9531 ± 0.0183, 0.9260 ± 0.0184, 0.9136 ± 0.0174, 0.9468 ± 0.0106, and 0.9362 ± 0.0125 for the training cohort, and 0.8984 ± 0.0536, 0.8671 ± 0.0489, 0.8500 ± 0.0599, 0.9118 ± 0.0550, and 0.8882 ± 0.0440 for the validation cohort, respectively (reported as mean ± standard deviation). The method showed better discrimination capability than single-view methods, our previous method, and eight baseline methods. When the dimension was reduced to 2, the model not only retained effective prediction performance but was also convenient for data visualization. Conclusions. Our proposed method, which integrates radiomics features of the primary tumor and LN, can be helpful in predicting lymph node metastasis in patients with GC. The results suggest that multi-view learning has great potential for guiding prognosis and treatment decision-making in GC.
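The Pearson correlation step of the feature selection described in this abstract can be illustrated with a minimal sketch. The abstract does not give a correlation threshold or say which feature of a correlated pair is discarded, so the 0.9 cutoff, the keep-the-first rule, the function names, and the toy feature values below are all assumptions made for illustration. The FSBudget and UMvPLS steps are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant feature carries no linear correlation
    return cov / (norm_x * norm_y)


def drop_correlated(features, threshold=0.9):
    """Greedy redundancy filter (assumed rule, not from the abstract).

    features: dict mapping feature name -> list of per-patient values.
    A feature is kept only if its absolute correlation with every
    feature kept so far does not exceed the threshold.
    """
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Toy values, invented for this example only.
toy = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.1],  # almost a multiple of f1, so redundant
    "f3": [4.0, 1.0, 3.0, 2.0],
}
selected = drop_correlated(toy)
print(selected)
```

In this sketch the filter would be run once on the tumor features and once on the LN features, matching the abstract's statement that the two views are handled separately.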
A Neoteric Feature Extraction Technique to Predict the Survival of Gastric Cancer Patients
Background: At the time of cancer diagnosis, it is crucial to accurately classify malignant gastric tumors and the possibility that patients will survive. Objective: This study aims to investigate the feasibility of identifying and applying a new feature extraction technique to predict the survival of gastric cancer patients. Methods: A retrospective dataset including the computed tomography (CT) images of 135 patients was assembled. Among them, 68 patients survived longer than three years. Several sets of radiomics features were extracted and were incorporated into a machine learning model, and their classification performance was characterized. To improve the classification performance, we further extracted another 27 texture and roughness parameters with 2484 superficial and spatial features to propose a new feature pool. This new feature set was added into the machine learning model and its performance was analyzed. To determine the best model for our experiment, Random Forest (RF) classifier, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Naïve Bayes (NB) (four of the most popular machine learning models) were utilized. The models were trained and tested using the five-fold cross-validation method. Results: Using the area under ROC curve (AUC) as an evaluation index, the model that was generated using the new feature pool yields AUC = 0.98 ± 0.01, which was significantly higher than the models created using the traditional radiomics feature set (p < 0.04). RF classifier performed better than the other machine learning models. Conclusions: This study demonstrated that although radiomics features produced good classification performance, creating new feature sets significantly improved the model performance.
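The evaluation protocol in this abstract (five-fold cross-validation scored by AUC) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the study's code: the function names are hypothetical, the fold assignment is a simple interleaving without the shuffling or stratification a real experiment would normally use, and the labels and scores are toy values. Only the cohort size of 135 comes from the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    positive-negative pairs; ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Split sample indices into k interleaved folds (no shuffling)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


# Toy labels and classifier scores, invented for this example.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)

# 135 patients, as in the abstract, split into five folds.
folds = k_fold_indices(135, k=5)
print([len(f) for f in folds])
```

Each fold would serve once as the test set while the remaining four train the classifier, and the per-fold AUC values would then be summarized as a mean and standard deviation.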
- PAR ID: 10579966
- Publisher / Repository: MDPI
- Date Published:
- Journal Name: Diagnostics
- Volume: 14
- Issue: 9
- ISSN: 2075-4418
- Page Range / eLocation ID: 954
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Objective Sudden unexpected death in epilepsy (SUDEP) is the leading cause of epilepsy-related mortality. Although considerable effort has gone into identifying clinical risk factors for SUDEP, there are few validated methods to predict individual SUDEP risk. Prolonged postictal EEG suppression (PGES) is a potential SUDEP biomarker, but its occurrence is infrequent and requires epilepsy monitoring unit admission. We use machine learning methods to examine SUDEP risk using interictal EEG and ECG recordings from SUDEP cases and matched living epilepsy controls. Methods This multicenter, retrospective cohort study examined interictal EEG and ECG recordings from 30 SUDEP cases and 58 age-matched living epilepsy patient controls. We trained machine learning models with interictal EEG and ECG features to predict the retrospective SUDEP risk for each patient. We assessed cross-validated classification accuracy and the area under the receiver operating characteristic curve (AUC). Results The logistic regression (LR) classifier produced the best overall performance, outperforming the support vector machine (SVM), random forest (RF), and convolutional neural network (CNN). Among the 30 patients with SUDEP [14 females; mean age (SD), 31 (8.47) years] and 58 living epilepsy controls [26 females (43%); mean age (SD) 31 (8.5) years], the LR model achieved a median AUC of 0.77 [interquartile range (IQR), 0.73–0.80] in five-fold cross-validation using the interictal alpha and low gamma power ratio of the EEG and heart rate variability (HRV) features extracted from the ECG. The LR model achieved a mean AUC of 0.79 in leave-one-center-out prediction. Conclusions Our results support the view that machine learning-driven models may quantify SUDEP risk for epilepsy patients. Future refinements of our model may help predict individualized SUDEP risk and help clinicians correlate predictive scores with the clinical data. 
Low-cost and noninvasive interictal biomarkers of SUDEP risk may help clinicians to identify high-risk patients and initiate preventive strategies.
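The leave-one-center-out evaluation mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming each recording carries a center label. The function name, the center labels "A", "B", "C", and the seven-sample layout are hypothetical, since the abstract does not list the centers or how patients are distributed among them.

```python
def leave_one_center_out(centers):
    """Yield (held_out_center, train_indices, test_indices) per center.

    centers: list giving the recording center of each sample.
    Every sample from the held-out center goes to the test set, so the
    model is always evaluated on a site it never saw during training.
    """
    for held_out in sorted(set(centers)):
        train = [i for i, c in enumerate(centers) if c != held_out]
        test = [i for i, c in enumerate(centers) if c == held_out]
        yield held_out, train, test


# Hypothetical center labels for seven recordings.
centers = ["A", "A", "B", "C", "B", "C", "A"]
splits = list(leave_one_center_out(centers))
for held_out, train, test in splits:
    print(held_out, train, test)
```

Compared with random five-fold cross-validation, this split gives a stricter estimate of how a model transfers to a new site, because site-specific recording conditions cannot leak from the training data into the test data.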
-
Abstract Prostate cancer treatment decisions rely heavily on subjective visual interpretation [assigning Gleason patterns or International Society of Urological Pathology (ISUP) grade groups] of limited numbers of two‐dimensional (2D) histology sections. Under this paradigm, interobserver variance is high, with ISUP grades not correlating well with outcome for individual patients, and this contributes to the over‐ and undertreatment of patients. Recent studies have demonstrated improved prognostication of prostate cancer outcomes based on computational analyses of glands and nuclei within 2D whole slide images. Our group has also shown that the computational analysis of three‐dimensional (3D) glandular features, extracted from 3D pathology datasets of whole intact biopsies, can allow for improved recurrence prediction compared to corresponding 2D features. Here we seek to expand on these prior studies by exploring the prognostic value of 3D shape‐based nuclear features in prostate cancer (e.g. nuclear size, sphericity). 3D pathology datasets were generated using open‐top light‐sheet (OTLS) microscopy of 102 cancer‐containing biopsies extracted ex vivo from the prostatectomy specimens of 46 patients. A deep learning‐based workflow was developed for 3D nuclear segmentation within the glandular epithelium versus stromal regions of the biopsies. 3D shape‐based nuclear features were extracted, and a nested cross‐validation scheme was used to train a supervised machine classifier based on 5‐year biochemical recurrence (BCR) outcomes. Nuclear features of the glandular epithelium were found to be more prognostic than stromal cell nuclear features (area under the ROC curve [AUC] = 0.72 versus 0.63). 3D shape‐based nuclear features of the glandular epithelium were also more strongly associated with the risk of BCR than analogous 2D features (AUC = 0.72 versus 0.62). 
The results of this preliminary investigation suggest that 3D shape‐based nuclear features are associated with prostate cancer aggressiveness and could be of value for the development of decision‐support tools. © 2023 The Pathological Society of Great Britain and Ireland.
-
Abstract Background Delta radiomics is a high‐throughput computational technique used to describe quantitative changes in serial, time‐series imaging by considering the relative change in radiomic features of images extracted at two distinct time points. Recent work has demonstrated a lack of prognostic signal of radiomic features extracted using this technique. We hypothesize that this lack of signal is due to the fundamental assumptions made when extracting features via delta radiomics, and that other methods should be investigated. Purpose The purpose of this work was to show a proof‐of‐concept of a new radiomics paradigm for sparse, time‐series imaging data, where features are extracted from a spatial‐temporal manifold modeling the time evolution between images, and to assess the prognostic value on patients with oropharyngeal cancer (OPC). Methods To accomplish this, we developed an algorithm to mathematically describe the relationship between two images acquired at two time points. These images serve as boundary conditions of a partial differential equation describing the transition from one image to the other. To solve this equation, we propagate the position and momentum of each voxel according to Fokker–Planck dynamics (i.e., a technique common in statistical mechanics). This transformation is driven by an underlying potential force uniquely determined by the equilibrium image. The solution generates a spatial‐temporal manifold (3 spatial dimensions + time) from which we define dynamic radiomic features. First, our approach was numerically verified by stochastically sampling dynamic Gaussian processes of monotonically decreasing noise. The transformation from high to low noise was compared between our Fokker–Planck estimation and simulated ground‐truth. To demonstrate feasibility and clinical impact, we applied our approach to 18F‐FDG‐PET images to estimate early metabolic response of patients (n = 57) undergoing definitive (chemo)radiation for OPC. 
Images were acquired pre‐treatment and 2‐weeks intra‐treatment (after 20 Gy). Dynamic radiomic features capturing changes in texture and morphology were then extracted. Patients were partitioned into two groups based on similar dynamic radiomic feature expression via k‐means clustering and compared by Kaplan–Meier analyses with log‐rank tests (p < 0.05). These results were compared to conventional delta radiomics to test the added value of our approach. Results Numerical results confirmed our technique can recover image noise characteristics given sparse input data as boundary conditions. Our technique was able to model tumor shrinkage and metabolic response. While no delta radiomics features proved prognostic, Kaplan–Meier analyses identified nine significant dynamic radiomic features. The most significant feature was Gray‐Level‐Size‐Zone‐Matrix gray‐level variance (p = 0.011), which demonstrated prognostic improvement over its corresponding delta radiomic feature (p = 0.722). Conclusions We developed, verified, and demonstrated the prognostic value of a novel, physics‐based radiomics approach over conventional delta radiomics via data assimilation of quantitative imaging and differential equations.
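For reference, the conventional delta radiomics baseline that this abstract compares against (the relative change of each feature between two time points) can be sketched in a few lines. This illustrates only the baseline, not the Fokker–Planck method the abstract proposes. The function name, feature names, and values are invented for the example, and skipping features with a zero baseline is an assumed convention.

```python
def delta_features(baseline, followup):
    """Relative change of each feature between two time points.

    baseline, followup: dicts mapping feature name -> value at the
    first and second imaging time point. Features missing at follow-up
    or with a zero baseline are skipped (assumed convention), since
    the relative change is undefined for them.
    """
    deltas = {}
    for name, before in baseline.items():
        if name in followup and before != 0:
            deltas[name] = (followup[name] - before) / before
    return deltas


# Hypothetical feature values at pre-treatment and intra-treatment scans.
pre = {"volume": 20.0, "entropy": 4.0}
intra = {"volume": 15.0, "entropy": 5.0}
deltas = delta_features(pre, intra)
print(deltas)
```

Because each delta feature depends only on the two endpoint values, it carries no information about the path between the scans, which is the limitation the abstract's manifold-based features are meant to address.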
-
Background and Objectives: Sepsis is a leading cause of mortality in intensive care units (ICUs). The development of a robust prognostic model utilizing patients’ clinical data could significantly enhance clinicians’ ability to make informed treatment decisions, potentially improving outcomes for septic patients. This study aims to create a novel machine-learning framework for constructing prognostic tools capable of predicting patient survival or mortality outcome. Methods: A novel dataset is created using concatenated triples of static data, temporal data, and clinical outcomes to expand data size. This structured input trains five machine learning classifiers (KNN, Logistic Regression, SVM, RF, and XGBoost) with advanced feature engineering. Models are evaluated on an independent cohort using AUROC and a new metric, 𝛾, which incorporates the F1 score, to assess discriminative power and generalizability. Results: We developed five prognostic models using the concatenated triple dataset with 10 dynamic features from patient medical records. Our analysis shows that the Extreme Gradient Boosting (XGBoost) model (AUROC = 0.777, F1 score = 0.694) and the Random Forest (RF) model (AUROC = 0.769, F1 score = 0.647), when paired with an ensemble under-sampling strategy, outperform other models. The RF model improves AUROC by 6.66% and reduces overfitting by 54.96%, while the XGBoost model shows a 0.52% increase in AUROC and a 77.72% reduction in overfitting. These results highlight our framework’s ability to enhance predictive accuracy and generalizability, particularly in sepsis prognosis. Conclusion: This study presents a novel modeling framework for predicting treatment outcomes in septic patients, designed for small, imbalanced, and high-dimensional datasets. By using temporal feature encoding, advanced sampling, and dimension reduction techniques, our approach enhances standard classifier performance. 
The resulting models show improved accuracy with limited data, offering valuable prognostic tools for sepsis management. This framework demonstrates the potential of machine learning in small medical datasets.