Decision forests, including RandomForest, XGBoost, and LightGBM, dominate machine learning tasks over tabular data. Recently, several frameworks were developed for decision forest inference, such as ONNX, TreeLite from Amazon, TensorFlow Decision Forests from Google, Hummingbird from Microsoft, Nvidia FIL, and lleaves. While these frameworks are fully optimized for inference computations, they are all decoupled from databases and general data management frameworks, which leads to cross-system performance overheads. We first provided a DICT model to understand the performance gaps between decoupled and in-database inference. We further identified that for in-database inference, in addition to the popular UDF-centric representation that encapsulates the ML inference in a single User-Defined Function (UDF), there also exists a relation-centric representation that breaks down decision forest inference into several fine-grained SQL operations. The relation-centric representation can achieve significantly better performance for large models. We optimized both implementations and conducted a comprehensive benchmark comparing them to the aforementioned decoupled inference pipelines and to existing in-database inference pipelines such as Spark-SQL and PostgresML. The evaluation results validated the DICT model and demonstrated the superior performance of our in-database inference design compared to the baselines.
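To make the relation-centric idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical schema in which every tree node is a row of a `node` table and every feature value is a row of a `sample` table, and it uses SQLite only because it ships with Python. The function name `predict_forest`, the table layout, and the averaging of leaf values are all illustrative assumptions; the abstract does not specify which SQL operations the paper uses.

```python
import sqlite3

# Hypothetical relation-centric layout (an assumption, not the paper's schema):
# node(tree_id, node_id, feature, threshold, left_id, right_id, leaf_value)
#   - internal nodes carry feature/threshold/children; leaves have feature NULL
# sample(sample_id, feature, value) in long format
WALK_QUERY = """
WITH RECURSIVE walk(sample_id, tree_id, node_id) AS (
    -- start every sample at the root (node_id = 0) of every tree
    SELECT s.sample_id, n.tree_id, n.node_id
    FROM (SELECT DISTINCT sample_id FROM sample) AS s
    CROSS JOIN node AS n
    WHERE n.node_id = 0
    UNION ALL
    -- one step down: join the current node with the sample's feature value;
    -- leaves have a NULL feature, so they never match and the walk stops
    SELECT w.sample_id, w.tree_id,
           CASE WHEN s.value <= n.threshold THEN n.left_id ELSE n.right_id END
    FROM walk AS w
    JOIN node AS n ON n.tree_id = w.tree_id AND n.node_id = w.node_id
    JOIN sample AS s ON s.sample_id = w.sample_id AND s.feature = n.feature
)
-- keep only the leaves that were reached and average them per sample
SELECT w.sample_id, AVG(n.leaf_value)
FROM walk AS w
JOIN node AS n ON n.tree_id = w.tree_id AND n.node_id = w.node_id
WHERE n.feature IS NULL
GROUP BY w.sample_id
ORDER BY w.sample_id
"""


def predict_forest(nodes, samples):
    """Return {sample_id: mean leaf value over all trees} using joins only."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE node (tree_id INT, node_id INT, feature TEXT, "
        "threshold REAL, left_id INT, right_id INT, leaf_value REAL)"
    )
    con.execute("CREATE TABLE sample (sample_id INT, feature TEXT, value REAL)")
    con.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?, ?, ?)", nodes)
    con.executemany("INSERT INTO sample VALUES (?, ?, ?)", samples)
    rows = con.execute(WALK_QUERY).fetchall()
    con.close()
    return dict(rows)


# Toy forest: two depth-1 trees, one splitting on "x" and one on "y".
NODES = [
    (0, 0, "x", 0.5, 1, 2, None),
    (0, 1, None, None, None, None, 0.0),
    (0, 2, None, None, None, None, 1.0),
    (1, 0, "y", 2.0, 1, 2, None),
    (1, 1, None, None, None, None, 0.0),
    (1, 2, None, None, None, None, 1.0),
]
SAMPLES = [
    (1, "x", 0.2), (1, "y", 3.0),
    (2, "x", 0.9), (2, "y", 3.0),
    (3, "x", 0.1), (3, "y", 1.0),
]
```

Calling `predict_forest(NODES, SAMPLES)` runs the whole forest as joins and an aggregate inside the database engine, in contrast to a UDF-centric design that would wrap the entire model in a single function call.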
                            End-to-End Evidential-Efficient Net for Radiomics Analysis of Brain MRI to Predict Oncogene Expression and Overall Survival
                        
                    
    
            We presented a novel radiomics approach using multimodality MRI to predict the expression of an oncogene (O6-Methylguanine-DNA methyltransferase, MGMT) and the overall survival (OS) of glioblastoma (GBM) patients. Specifically, we employed EffNetV2-T, a downscaled and modified version of EfficientNetV2, as the feature extractor. In addition, we used evidential layers to control the distribution of prediction outputs. The evidential layers help classify the high-dimensional radiomics features to predict the methylation status of MGMT and OS. Tests showed that our model achieved an accuracy of 0.844, making it a potential clinic-enabling technique for the diagnosis and management of GBM. Comparison results indicated that our method performed better than existing work.
- Award ID(s): 2115095
- PAR ID: 10496627
- Editor(s): Wang, Linwei; Dou, Qi; Fletcher, P. Thomas; Speidel, Stefanie; Li, Shuo
- Publisher / Repository: Springer Nature Switzerland
- Date Published:
- Journal Name: Medical Image Computing and Computer Assisted Intervention -- MICCAI 2022
- ISBN: 978-3-031-16437-8
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Background: At the time of cancer diagnosis, it is crucial to accurately classify malignant gastric tumors and to estimate the likelihood that patients will survive. Objective: This study aims to investigate the feasibility of identifying and applying a new feature extraction technique to predict the survival of gastric cancer patients. Methods: A retrospective dataset including the computed tomography (CT) images of 135 patients was assembled. Among them, 68 patients survived longer than three years. Several sets of radiomics features were extracted and incorporated into a machine learning model, and their classification performance was characterized. To improve the classification performance, we further extracted another 27 texture and roughness parameters with 2484 superficial and spatial features to propose a new feature pool. This new feature set was added to the machine learning model and its performance was analyzed. To determine the best model for our experiment, four of the most popular machine learning models were utilized: Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Naïve Bayes (NB). The models were trained and tested using five-fold cross-validation. Results: Using the area under the ROC curve (AUC) as the evaluation index, the model generated using the new feature pool yielded AUC = 0.98 ± 0.01, which was significantly higher than the models created using the traditional radiomics feature set (p < 0.04). The RF classifier performed better than the other machine learning models. Conclusions: This study demonstrated that although radiomics features produced good classification performance, creating new feature sets significantly improved the model performance.
- Abstract. Background: Glioblastoma Multiforme (GBM) is a fast-growing and highly aggressive brain tumor that invades the nearby brain tissue and presents secondary nodular lesions across the whole brain but generally does not spread to distant organs. Without treatment, GBM can result in death in about 6 months. The challenges are known to depend on multiple factors: brain localization, resistance to conventional therapy, disrupted tumor blood supply inhibiting effective drug delivery, complications from peritumoral edema, intracranial hypertension, seizures, and neurotoxicity. Main text: Imaging techniques are routinely used to accurately detect the lesions that localize brain tumors. In particular, magnetic resonance imaging (MRI) delivers multimodal images both before and after the administration of contrast, which displays enhancement and describes physiological features such as hemodynamic processes. This review considers one possible extension of the use of radiomics in GBM studies, one that recalibrates the analysis of targeted segmentations to the whole-organ scale. After identifying critical areas of research, the focus is on illustrating the potential utility of an integrated approach with multimodal imaging, radiomic data processing, and brain atlases as the main components. The templates associated with the outcome of straightforward analyses represent promising inference tools able to spatio-temporally inform on GBM evolution while also being generalizable to other cancers. Conclusions: The focus on novel inference strategies applicable to complex cancer systems and based on building radiomic models from multimodal imaging data can be well supported by machine learning and other computational tools potentially able to translate suitably processed information into more accurate patient stratifications and evaluations of treatment efficacy.
- Proteinuria, the presence of high molecular weight proteins in the urine, is a primary indicator of chronic kidney disease. Proteinuria results from increased molecular permeability of the glomerular filtration barrier combined with saturation or defects in tubular protein reabsorption. Any solute that passes into the glomerular filtrate traverses the glomerular endothelium, the glomerular basement membrane (GBM), and the podocyte slit diaphragm. Damage to any layer of the filter has reciprocal effects on other layers to increase glomerular permeability. The GBM is thought to act as a compressible ultrafilter whose molecular selectivity increases with pressure, because compression reduces the porosity of the GBM. In multiple forms of chronic kidney disease, crosslinking enzymes are upregulated and may act to increase GBM stiffness. Here we show that enzymatically crosslinking porcine GBM with transglutaminase increases the stiffness of the GBM and mitigates pressure-dependent reductions in molecular sieving coefficient. This was modeled mathematically using a modified membrane transport model accounting for GBM compression. Changes in the mechanical properties of the GBM may contribute to proteinuria through pressure-dependent effects on GBM porosity.
- Abstract. Background: Glioblastoma Multiforme, an aggressive primary brain tumor, has a poor prognosis and no effective standard of care treatments. Most patients undergoing radiotherapy, along with Temozolomide chemotherapy, develop resistance to the drug, and recurrence of the tumor is a common issue after the treatment. We propose to model the pathways active in Glioblastoma using Boolean network techniques. The network captures the genetic interactions and possible mutations that are involved in the development of the brain tumor. The model is used to predict the theoretical efficacies of drugs for the treatment of cancer. Results: We use the Boolean network to rank the critical intervention points in the pathway to predict an effective therapeutic strategy for Glioblastoma. Drug repurposing helps to identify non-cancer drugs that could be effective in cancer treatment. We predict the effectiveness of drug combinations of anti-cancer and non-cancer drugs for Glioblastoma. Conclusions: Given the genetic profile of a GBM tumor, the Boolean model can predict the most effective targets for treatment. We also identified two-drug combinations that could be more effective in killing GBM cells than conventional chemotherapeutic agents. The non-cancer drug Aspirin could potentially increase the cytotoxicity of TMZ in GBM patients.
 An official website of the United States government