Title: Evaluating Latent Space Robustness and Uncertainty of EEG-ML Models under Realistic Distribution Shifts
The recent availability of large datasets in bio-medicine has inspired the development of representation learning methods for multiple healthcare applications. Despite advances in predictive performance, the clinical utility of such methods is limited when exposed to real-world data. This study develops model diagnostic measures to detect potential pitfalls before deployment without assuming access to external data. Specifically, we focus on modeling realistic data shifts in electrophysiological signals (EEGs) via data transforms and extend the conventional task-based evaluations with analyses of a) the model's latent space and b) predictive uncertainty under these transforms. We conduct experiments on multiple EEG feature encoders and two clinically relevant downstream tasks using publicly available large-scale clinical EEGs. Within this experimental setting, our results suggest that measures of latent space integrity and model uncertainty under the proposed data shifts may help anticipate performance degradation during deployment.
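As a rough illustration of the kind of diagnostic the abstract describes, the sketch below applies a simulated data shift to synthetic signals and then measures (a) how far the latent representation moves and (b) how the predictive entropy changes. Everything here is an assumption made for illustration: the synthetic sinusoid data, the gain-plus-noise transform, the random linear encoder, the cosine-similarity measure, and all function names are hypothetical stand-ins and are not the transforms, encoders, or metrics used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_signals(n=64, channels=4, samples=256, fs=128.0):
    """Synthetic stand-in for EEG windows: noisy 10 Hz sinusoids (not real EEG)."""
    t = np.arange(samples) / fs
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(n, channels, 1))
    clean = np.sin(2.0 * np.pi * 10.0 * t + phase)
    return clean + 0.1 * rng.standard_normal((n, channels, samples))


def shift(x, gain=1.0, noise_std=0.0):
    """Hypothetical data shift: amplitude scaling plus additive Gaussian noise."""
    return gain * x + noise_std * rng.standard_normal(x.shape)


def encode(x, w):
    """Toy encoder (assumption): flatten each window, project linearly, apply tanh."""
    return np.tanh(x.reshape(len(x), -1) @ w)


def mean_cosine(a, b):
    """Mean cosine similarity between paired rows of a and b."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    return float(np.mean(num / den))


def predictive_entropy(z, head):
    """Mean softmax entropy of a linear classification head on latents z."""
    logits = z @ head
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p = p / p.sum(axis=1, keepdims=True)
    return float(np.mean(-np.sum(p * np.log(p + 1e-12), axis=1)))


x = make_signals()
w = rng.standard_normal((4 * 256, 16)) / np.sqrt(4 * 256)  # random, untrained encoder
head = rng.standard_normal((16, 2))                        # random two-class head

z_clean = encode(x, w)
z_shift = encode(shift(x, gain=0.5, noise_std=0.5), w)

similarity = mean_cosine(z_clean, z_shift)        # latent-space agreement
entropy_clean = predictive_entropy(z_clean, head)
entropy_shift = predictive_entropy(z_shift, head)

print(f"latent cosine similarity under shift: {similarity:.3f}")
print(f"predictive entropy clean/shifted: {entropy_clean:.3f} / {entropy_shift:.3f}")
```

With a trained encoder and real recordings in place of the random weights and sinusoids, a drop in latent similarity or a change in entropy under a given transform would be the kind of signal one might inspect before deployment. The sketch makes no claim about what values to expect.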
Award ID(s):
2105233
PAR ID:
10450526
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Advances in neural information processing systems
Volume:
36
ISSN:
1049-5258
Page Range / eLocation ID:
21142-21156
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Artificial intelligence-based prostate cancer (PCa) detection models have been widely explored to assist clinical diagnosis. However, these trained models may generate erroneous results, especially on datasets that are not within the training distribution. In this paper, we propose an approach to tackle this so-called out-of-distribution (OOD) data problem. Specifically, we devise an end-to-end unsupervised framework to estimate uncertainty values for cases analyzed by a previously trained PCa detection model. Our PCa detection model takes bpMRI scans as input, and through our proposed approach we identify OOD cases that are likely to generate degraded performance due to data distribution shifts. The proposed OOD framework consists of two parts. First, an autoencoder-based reconstruction network is proposed, which learns discrete latent representations of in-distribution data. Second, the uncertainty is computed using a perceptual loss that measures the distance between original and reconstructed images in the feature space of a pre-trained PCa detection network. The effectiveness of the proposed framework is evaluated on seven independent data collections with a total of 1,432 cases. The performance of the pre-trained PCa detection model is significantly improved by excluding cases with high uncertainty.
  2. A highly accurate but overconfident model is ill-suited for deployment in critical applications such as healthcare and autonomous driving. The classification outcome should reflect a high uncertainty on ambiguous in-distribution samples that lie close to the decision boundary. The model should also refrain from making overconfident decisions on samples that lie far outside its training distribution, far-out-of-distribution (far-OOD), or on unseen samples from novel classes that lie near its training distribution (near-OOD). This paper proposes an application of counterfactual explanations in fixing an overconfident classifier. Specifically, we propose to fine-tune a given pre-trained classifier using augmentations from a counterfactual explainer (ACE) to fix its uncertainty characteristics while retaining its predictive performance. We perform extensive experiments with detecting far-OOD, near-OOD, and ambiguous samples. Our empirical results show that the revised model has improved uncertainty measures, and its performance is competitive with the state-of-the-art methods.
  3. Multivariate spatially oriented data sets are prevalent in the environmental and physical sciences. Scientists seek to jointly model multiple variables, each indexed by a spatial location, to capture any underlying spatial association for each variable and associations among the different dependent variables. Multivariate latent spatial process models have proved effective in driving statistical inference and rendering better predictive inference at arbitrary locations for the spatial process. High‐dimensional multivariate spatial data, which are the theme of this article, refer to data sets where the number of spatial locations and the number of spatially dependent variables are very large. The field has witnessed substantial developments in scalable models for univariate spatial processes, but such methods for multivariate spatial processes, especially when the number of outcomes is moderately large, are limited in comparison. Here, we extend scalable modeling strategies for a single process to multivariate processes. We pursue Bayesian inference, which is attractive for full uncertainty quantification of the latent spatial process. Our approach exploits distribution theory for the matrix‐normal distribution, which we use to construct scalable versions of a hierarchical linear model of coregionalization (LMC) and spatial factor models that deliver inference over a high‐dimensional parameter space including the latent spatial process. We illustrate the computational and inferential benefits of our algorithms over competing methods using simulation studies and an analysis of a massive vegetation index data set.
  4. Electrophysiologic disturbances due to neurodegenerative disorders such as Alzheimer’s disease and Lewy Body disease are detectable by scalp EEG and can serve as a functional measure of disease severity. Traditional quantitative methods of EEG analysis often require an a-priori selection of clinically meaningful EEG features and are susceptible to bias, limiting the clinical utility of routine EEGs in the diagnosis and management of neurodegenerative disorders. We present a data-driven tensor decomposition approach to extract the top 6 spectral and spatial features representing commonly known sources of EEG activity during eyes-closed wakefulness. As part of their neurologic evaluation at Mayo Clinic, 11 001 patients underwent 12 176 routine, standard 10–20 scalp EEG studies. From these raw EEGs, we developed an algorithm based on posterior alpha activity and eye movement to automatically select awake-eyes-closed epochs and estimated average spectral power density (SPD) between 1 and 45 Hz for each channel. We then created a three-dimensional (3D) tensor (record × channel × frequency) and applied a canonical polyadic decomposition to extract the top six factors. We further identified an independent cohort of patients meeting consensus criteria for mild cognitive impairment (30) or dementia (39) due to Alzheimer’s disease and dementia with Lewy Bodies (31) and similarly aged cognitively normal controls (36). We evaluated the ability of the six factors in differentiating these subgroups using a Naïve Bayes classification approach and assessed for linear associations between factor loadings and Kokmen short test of mental status scores, fluorodeoxyglucose (FDG) PET uptake ratios and CSF Alzheimer’s Disease biomarker measures. Factors represented biologically meaningful brain activities including posterior alpha rhythm, anterior delta/theta rhythms and centroparietal beta, which correlated with patient age and EEG dysrhythmia grade. These factors were also able to distinguish patients from controls with a moderate to high degree of accuracy (Area Under the Curve (AUC) 0.59–0.91) and Alzheimer’s disease dementia from dementia with Lewy Bodies (AUC 0.61). Furthermore, relevant EEG features correlated with cognitive test performance, PET metabolism and CSF AB42 measures in the Alzheimer’s subgroup. This study demonstrates that data-driven approaches can extract biologically meaningful features from population-level clinical EEGs without artefact rejection or a-priori selection of channels or frequency bands. With continued development, such data-driven methods may improve the clinical utility of EEG in memory care by assisting in early identification of mild cognitive impairment and differentiating between different neurodegenerative causes of cognitive impairment.
  5. The availability of large-scale electronic health record datasets has led to the development of artificial intelligence (AI) methods for clinical risk prediction that help improve patient care. However, existing studies have shown that AI models suffer from severe performance decay after several years of deployment, which might be caused by various temporal dataset shifts. When the shift occurs, we have access to large-scale pre-shift data and small-scale post-shift data that are not enough to train new models in the post-shift environment. In this study, we propose a new method to address the issue. We reweight patients from the pre-shift environment to mitigate the distribution shift between pre- and post-shift environments. Moreover, we adopt a Kullback-Leibler divergence loss to force the models to learn similar patient representations in pre- and post-shift environments. Our experimental results show that our model efficiently mitigates temporal shifts, improving prediction performance.