Title: Towards optimal deep fusion of imaging and clinical data via a model‐based description of fusion quality
Abstract
Background: Due to intrinsic differences in data formatting, data structure, and underlying semantic information, the integration of imaging data with clinical data can be non‐trivial. Optimal integration requires robust data fusion, that is, the process of integrating multiple data sources to produce more useful information than is captured by the individual data sources. Here, we introduce the concept of fusion quality for deep learning problems involving imaging and clinical data. We first provide a general theoretical framework and numerical validation of our technique. To demonstrate real‐world applicability, we then apply our technique to optimize the fusion of CT imaging and hepatic blood markers to estimate portal venous hypertension, which is linked to prognosis in patients with cirrhosis of the liver.
Purpose: To develop a method for measuring optimal data fusion quality in deep learning problems utilizing both imaging data and clinical data.
Methods: Our approach is based on modeling the fully connected layer (FCL) of a convolutional neural network (CNN) as a potential function, whose distribution takes the form of the classical Gibbs measure. The features of the FCL are then modeled as random variables governed by state functions, which are interpreted as the different data sources to be fused. The probability density of each source, relative to the probability density of the FCL, represents a quantitative measure of source‐bias. To minimize this source‐bias and optimize CNN performance, we implement a vector‐growing encoding scheme called positional encoding, where low‐dimensional clinical data are transcribed into a rich feature space that complements high‐dimensional imaging features. We first numerically validated our approach on simulated Gaussian processes. We then applied our approach to patient data, where we optimized the fusion of CT images with blood markers to predict portal venous hypertension in patients with cirrhosis of the liver. This patient study was based on a modified ResNet‐152 model that incorporates both images and blood markers as input. These two data sources were processed in parallel, fused into a single FCL, and optimized based on our fusion quality framework.
Results: Numerical validation of our approach confirmed that the probability density function of a fused feature space converges to a source‐specific probability density function when source data are improperly fused. Our numerical results demonstrate that this phenomenon can be quantified as a measure of fusion quality. On patient data, the fused model consisting of both imaging data and positionally encoded blood markers at the theoretically optimal fusion quality metric achieved an AUC of 0.74 and an accuracy of 0.71. This model was statistically better than the imaging‐only model (AUC = 0.60; accuracy = 0.62), the blood marker‐only model (AUC = 0.58; accuracy = 0.60), and a variety of purposely sub‐optimized fusion models (AUC = 0.61–0.70; accuracy = 0.58–0.69).
Conclusions: We introduced the concept of data fusion quality for multi‐source deep learning problems involving both imaging and clinical data. We provided a theoretical framework, numerical validation, and real‐world application in abdominal radiology. Our data suggest that CT imaging and hepatic blood markers provide complementary diagnostic information when appropriately fused.
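The abstract does not give the formula behind its positional encoding, so the following is only a minimal sketch of the general idea: a few scalar clinical values are expanded into a longer feature vector before being concatenated with image features. It assumes a sinusoidal encoding with power-of-two frequencies, which is a common choice but not necessarily the one used in the paper. The function name `positional_encode`, the parameter `num_frequencies`, the three marker values, and the 2048-length image feature vector are all hypothetical.

```python
import numpy as np


def positional_encode(values, num_frequencies=8):
    """Expand each scalar x into sin/cos features at several frequencies.

    Assumed scheme (not taken from the paper): for k = 0..num_frequencies-1,
    emit sin(2^k * pi * x) and cos(2^k * pi * x). Each input value therefore
    grows from 1 dimension to 2 * num_frequencies dimensions.
    """
    values = np.asarray(values, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_frequencies)) * np.pi   # shape (K,)
    angles = values[..., None] * freqs                     # shape (..., M, K)
    encoded = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    # Flatten the per-marker encodings into one feature vector per sample.
    return encoded.reshape(*values.shape[:-1], -1)         # shape (..., M * 2K)


# Hypothetical inputs: three blood markers scaled to [0, 1] and a
# placeholder image feature vector (length chosen for illustration only).
markers = np.array([0.2, 0.5, 0.9])
image_features = np.zeros(2048)

encoded_markers = positional_encode(markers, num_frequencies=8)
fused = np.concatenate([image_features, encoded_markers])
```

In this sketch, `num_frequencies` controls how many dimensions the clinical data occupy relative to the image features. A fusion quality criterion like the one the abstract describes would presumably guide that kind of choice, but how it does so is not stated here.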
Award ID(s):
2106988
PAR ID:
10530687
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Publisher / Repository:
American Association of Physicists in Medicine
Date Published:
Journal Name:
Medical Physics
Volume:
50
Issue:
6
ISSN:
0094-2405
Page Range / eLocation ID:
3526 to 3537
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background and Aim: Copper is an essential trace metal serving as a cofactor in innate immunity, metabolism, and iron transport. We hypothesize that copper deficiency may influence survival in patients with cirrhosis through these pathways. Methods: We performed a retrospective cohort study involving 183 consecutive patients with cirrhosis or portal hypertension. Copper from blood and liver tissues was measured using inductively coupled plasma mass spectrometry. Polar metabolites were measured using nuclear magnetic resonance spectroscopy. Copper deficiency was defined by serum or plasma copper below 80 µg/dL for women or 70 µg/dL for men. Results: The prevalence of copper deficiency was 17% (N=31). Copper deficiency was associated with younger age, race, zinc and selenium deficiency, and higher infection rates (42% vs. 20%, p=0.01). Serum copper correlated positively with albumin, ceruloplasmin, and hepatic copper, and negatively with IL-1β. Levels of polar metabolites involved in amino acid catabolism, mitochondrial transport of fatty acids, and gut microbial metabolism differed significantly according to copper deficiency status. During a median follow-up of 396 days, mortality was 22.6% in patients with copper deficiency compared with 10.5% in patients without. Liver transplantation rates were similar (32% vs. 30%). Cause-specific competing risk analysis showed that copper deficiency was associated with a significantly higher risk of death before transplantation after adjusting for age, sex, MELD-Na, and Karnofsky score (HR: 3.40, 95% CI, 1.18–9.82, p=0.023). Conclusions: In advanced cirrhosis, copper deficiency is relatively common and is associated with an increased infection risk, a distinctive metabolic profile, and an increased risk of death before transplantation.
  2. Abstract Objective: SNOMED CT is the largest clinical terminology worldwide. Quality assurance of SNOMED CT is of utmost importance to ensure that it provides accurate domain knowledge to various SNOMED CT-based applications. In this work, we introduce a deep learning-based approach to uncover missing is-a relations in SNOMED CT. Materials and Methods: Our focus is to identify missing is-a relations between concept-pairs exhibiting a containment pattern (i.e., the set of words of one concept being a proper subset of that of the other concept). We use hierarchically related containment concept-pairs as positive instances and hierarchically unrelated containment concept-pairs as negative instances to train a model predicting whether an is-a relation exists between 2 concepts with a containment pattern. The model is a binary classifier leveraging concept name features, hierarchical features, enriched lexical attribute features, and logical definition features. We introduce a cross-validation inspired approach to identify missing is-a relations among all hierarchically unrelated containment concept-pairs. Results: We trained and applied our model on the Clinical finding subhierarchy of SNOMED CT (September 2019 US edition). Our model (based on the validation sets) achieved a precision of 0.8164, recall of 0.8397, and F1 score of 0.8279. Applying the model to predict actual missing is-a relations, we obtained a total of 1661 potential candidates. Domain experts evaluated 230 randomly selected samples and verified that 192 (83.48%) are valid. Conclusions: The results showed that our deep learning approach is effective in uncovering missing is-a relations between containment concept-pairs in SNOMED CT.
  3. Abstract Background: Idiopathic pulmonary fibrosis (IPF) is a progressive, irreversible, and usually fatal lung disease of unknown cause, generally affecting the elderly population. Early diagnosis of IPF is crucial for triaging patients’ treatment planning into anti‐fibrotic treatment or treatments for other causes of pulmonary fibrosis. However, the current IPF diagnosis workflow is complicated and time‐consuming: it involves collaborative efforts from radiologists, pathologists, and clinicians, and it is largely subject to inter‐observer variability. Purpose: The purpose of this work is to develop a deep learning‐based automated system that can diagnose subjects with IPF among subjects with interstitial lung disease (ILD) using an axial chest computed tomography (CT) scan. This work can potentially enable timely diagnosis decisions and reduce inter‐observer variability. Methods: Our dataset contains CT scans from 349 IPF patients and 529 non‐IPF ILD patients. We used 80% of the dataset for training and validation purposes and 20% as the holdout test set. We proposed a two‐stage model: at stage one, we built a multi‐scale, domain knowledge‐guided attention model (MSGA) that encouraged the model to focus on specific areas of interest to enhance model explainability, including both high‐ and medium‐resolution attentions; at stage two, we collected the output from MSGA and constructed a random forest (RF) classifier for patient‐level diagnosis, to further boost model accuracy. The RF classifier is utilized as a final decision stage since it is interpretable, computationally fast, and can handle correlated variables. Model utility was examined by (1) accuracy, represented by the area under the receiver operating characteristic curve (AUC) with standard deviation (SD), and (2) explainability, illustrated by the visual examination of the estimated attention maps, which showed the important areas for model diagnostics.
Results: During the training and validation stage, we observe that when we provide no guidance from domain knowledge, the IPF diagnosis model reaches acceptable performance (AUC±SD = 0.93±0.07) but lacks explainability; when including only guided high‐ or medium‐resolution attention, the learned attention maps are not satisfactory; when including both high‐ and medium‐resolution attention, under certain hyperparameter settings, the model reaches the highest AUC among all experiments (AUC±SD = 0.99±0.01) and the estimated attention maps concentrate on the regions of interest for this task. The three best‐performing hyperparameter selections according to MSGA were applied to the holdout test set and reached model performance comparable to that of the validation set. Conclusions: Our results suggest that, for a task with only scan‐level labels available, MSGA+RF can utilize the population‐level domain knowledge to guide the training of the network, which increases both model accuracy and explainability.
  4. Abstract Diet affects cancer risk, and plant-derived polyphenols exhibit cancer-preventive properties. Walnuts are an exceptional source of polyphenolic ellagitannins, converted into urolithins by gut microflora. This clinical study examines the impact of urolithin metabolism on inflammatory markers in blood and colon polyp tissue. We evaluate the effects of walnut consumption on urinary urolithins, serum inflammatory markers, and immune cell markers in polyp tissues obtained from 39 subjects. Together with detailed food frequency data, we perform integrated computational analysis of metabolomic data combined with serum inflammatory markers and spatial imaging of polyp tissues using imaging mass cytometry. LC/MS-MS analyses of urine and fecal samples identify a widely divergent capacity to form nine urolithin metabolites in this patient population. Subjects with higher urolithin A formation exhibit lower levels of several key serologic inflammatory markers, including C-peptide, soluble form of intracellular adhesion molecule 1, sIL-6R, ghrelin, TRAIL, sVEGFR2, platelet-derived growth factor (PDGF), and MCP-2, alterations that are more pronounced in obese individuals for soluble form of intracellular adhesion molecule 1, epithelial neutrophil–activating peptide 78, leptin, glucagon-like peptide 1, and macrophage inflammatory protein 1δ. There is a significant increase in levels of peptide YY associated with urolithin A formation, whereas TNFα levels show an opposite trend, recapitulated in an in vitro system with ionomycin/phorbol 12-myristate 13-acetate–stimulated peripheral blood mononuclear cells (PBMC). Spatial imaging of colon polyp tissues shows altered cell cluster patterns, including a significant reduction of vimentin and CD163 expression associated with urolithin A. The ability to form urolithin A is linked to inflammation, warranting further studies to understand the role of urolithins in cancer prevention. 
Prevention Relevance: We evaluate cancer-protective effects of walnuts via formation of microbe-derived urolithin A, substantiating their functional benefits on serum inflammatory markers and immunologic composition of polyps in normal/obese subjects. Our approach incorporates personalized nutrition within the context of colonic health, providing the rationale for dietary inclusion of walnut ellagitannins for cancer prevention. 
  5. Fibromyalgia (FM) is a chronic central sensitivity syndrome characterized by augmented pain processing at diffuse body sites and presents as a multimorbid clinical condition. Long COVID (LC) is a heterogeneous clinical syndrome that affects 10–20% of individuals following COVID-19 infection. FM and LC share similarities with regard to the pain and other clinical symptoms experienced, thereby posing a challenge for accurate diagnosis. This research explores the feasibility of using surface-enhanced Raman spectroscopy (SERS) combined with soft independent modelling of class analogy (SIMCA) to develop classification models differentiating LC and FM. Venous blood samples were collected using two supports, dried blood spot cards (DBS, n = 48 FM and n = 46 LC) and volumetric absorptive micro-sampling tips (VAMS, n = 39 FM and n = 39 LC). A semi-permeable membrane (10 kDa) was used to extract the low molecular fraction (LMF) from the blood samples, and Raman spectra were acquired using SERS with gold nanoparticles (AuNPs). SIMCA models developed with spectral data of blood samples collected in VAMS tips showed superior performance, with a validation performance of 100% accuracy, sensitivity, and specificity and an area under the curve (AUC) of 0.86. Amide groups and aromatic and acidic amino acids were responsible for the discrimination patterns between FM and LC syndromes, reinforcing the findings from our previous studies. Overall, our results demonstrate the ability of AuNP SERS to identify unique metabolites that can potentially be used as spectral biomarkers to differentiate FM and LC.