This content will become publicly available on March 6, 2026

Title: Assessing Classification Models of Pharmaceuticals With Conformal Prediction
ABSTRACT Conformal predictions transform a measurable, heuristic notion of uncertainty into statistically valid confidence intervals such that, for a future sample, the true class will be included in the conformal prediction set at a predetermined confidence level. From a Bayesian perspective, common estimates of uncertainty in multivariate classification, namely p‐values, only provide the probability that the data fit the presumed class model, P(D|M). Conformal predictions, on the other hand, address the more meaningful probability that a model fits the data, P(M|D). Herein, two methods to perform inductive conformal predictions are investigated: the traditional Split Conformal Prediction, which uses an external calibration set, and a novel Bagged Conformal Prediction, closely related to Cross Conformal Predictions, which uses bagging to calibrate the heuristic notions of uncertainty. Methods for preprocessing the conformal prediction scores to improve performance are discussed and investigated. These conformal prediction strategies are applied to identifying four non‐steroidal anti‐inflammatory drugs (NSAIDs) from hyperspectral Raman imaging data. In addition to assigning meaningful confidence intervals to the model results, we demonstrate how conformal predictions can provide additional diagnostics for model quality and method stability.
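As a rough illustration of the Split Conformal Prediction idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a classifier that outputs class probabilities and uses one minus the true-class probability as the nonconformity score, which is a common textbook choice; the record does not state which heuristic uncertainty measure the paper uses. The function names `split_conformal_threshold` and `prediction_sets` are hypothetical.

```python
import numpy as np

def split_conformal_threshold(cal_probs, cal_labels, alpha):
    """Compute a score threshold from a held-out calibration set.

    cal_probs  : (n, n_classes) array of predicted class probabilities
    cal_labels : (n,) array of integer true-class indices
    alpha      : allowed error rate (e.g. 0.1 for 90% target coverage)
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = len(cal_labels)
    # Assumed nonconformity score: 1 - probability given to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration samples for this alpha: include every class.
        return np.inf
    return np.sort(scores)[k - 1]

def prediction_sets(test_probs, threshold):
    """Boolean mask (samples x classes); True means the class is in the set."""
    test_probs = np.asarray(test_probs, dtype=float)
    return (1.0 - test_probs) <= threshold
```

Under the usual exchangeability assumption between calibration and test samples, sets built this way contain the true class with probability of at least 1 - alpha. A set with several classes signals an ambiguous sample, and an empty set signals a sample unlike anything in the calibration data.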
Award ID(s):
2003839
PAR ID:
10576958
Author(s) / Creator(s):
 ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Journal of Chemometrics
Volume:
39
Issue:
3
ISSN:
0886-9383
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep learning models are being adopted and applied across various critical medical tasks, yet they are primarily trained to provide point predictions without providing degrees of confidence. Medical practitioners' trust in deep learning models increases when predictions are paired with uncertainty estimates. Conformal Prediction has emerged as a promising method to pair machine learning models with prediction intervals, exposing the model's uncertainty. However, popular uncertainty estimation methods for conformal prediction fail to provide highly accurate heteroskedastic intervals. In this paper, we propose a method to estimate the uncertainty of each sample by calculating the variance obtained from a Deep Regression Forest. We show that the deep regression forest variance improves the efficiency and coverage of normalized inductive conformal prediction when applied to an anti-cancer drug sensitivity prediction task.
  2. Conformal prediction builds marginally valid prediction intervals that cover the unknown outcome of a randomly drawn test point with a prescribed probability. However, in practice, data-driven methods are often used to identify specific test unit(s) of interest, requiring uncertainty quantification tailored to these focal units. In such cases, marginally valid conformal prediction intervals may fail to provide valid coverage for the focal unit(s) due to selection bias. This article presents a general framework for constructing a prediction set with finite-sample exact coverage, conditional on the unit being selected by a given procedure. The general form of our method accommodates arbitrary selection rules that are invariant to the permutation of the calibration units and generalizes Mondrian Conformal Prediction to multiple test units and non-equivariant classifiers. We also work out computationally efficient implementations of our framework for a number of realistic selection rules, including top-K selection, optimization-based selection, selection based on conformal p-values, and selection based on properties of preliminary conformal prediction sets. The performance of our methods is demonstrated via applications in drug discovery and health risk prediction.
  3. To assess the effect of uncertainties in solar wind driving on the predictions from the operational configuration of the Space Weather Modeling Framework, we have developed a nonparametric method for generating multiple possible realizations of the solar wind just upstream of the bow shock, based on observations near the first Lagrangian point. We have applied this method to the solar wind inputs at the upstream boundary of Space Weather Modeling Framework and have simulated the geomagnetic storm of 5 April 2010. We ran a 40‐member ensemble for this event and have used this ensemble to quantify the uncertainty in the predicted Sym‐H index and ground magnetic disturbances due to the uncertainty in the upstream boundary conditions. Both the ensemble mean and the unperturbed simulation tend to underpredict the magnitude of Sym‐H in the quiet interval before the storm and overpredict in the storm itself, consistent with previous work. The ensemble mean is a more accurate predictor of Sym‐H, improving the mean absolute error by nearly 2 nT for this interval and displaying a smaller bias. We also examine the uncertainty in predicted maxima in ground magnetic disturbances. The confidence intervals are typically narrow during periods where the predicted dBH/dt is low. The confidence intervals are often much wider where the median prediction is for enhanced dBH/dt. The ensemble also allows us to identify intervals of activity that cannot be explained by uncertainty in the solar wind driver, driving further model improvements. This work demonstrates the feasibility and importance of ensemble modeling for space weather applications.
  4. Conformal prediction is a powerful tool to generate uncertainty sets with guaranteed coverage using any predictive model, under the assumption that the training and test data are i.i.d. Recently, it has been shown that adversarial examples are able to manipulate conformal methods to construct prediction sets with invalid coverage rates, as the i.i.d. assumption is violated. To address this issue, a recent work, Randomized Smoothed Conformal Prediction (RSCP), was proposed to certify the robustness of conformal prediction methods to adversarial noise. However, RSCP has two major limitations: (i) its robustness guarantee is flawed when used in practice and (ii) it tends to produce large uncertainty sets. To address these limitations, we first propose a novel framework called RSCP+ to provide a provable robustness guarantee in evaluation, which fixes the issues in the original RSCP method. Next, we propose two novel methods, Post-Training Transformation (PTT) and Robust Conformal Training (RCT), to effectively reduce prediction set size with little computation overhead. Experimental results on CIFAR10, CIFAR100, and ImageNet suggest the baseline method only yields trivial predictions that include the full label set, while our methods boost the efficiency by up to 4.36×, 5.46×, and 16.9× respectively and provide a practical robustness guarantee.
  5. In regression problems where there is no known true underlying model, conformal prediction methods enable prediction intervals to be constructed without any assumptions on the distribution of the underlying data, except that the training and test data are assumed to be exchangeable. However, these methods bear a heavy computational cost—and, to be carried out exactly, the regression algorithm would need to be fitted infinitely many times. In practice, the conformal prediction method is run by simply considering only a finite grid of finely spaced values for the response variable. This paper develops discretized conformal prediction algorithms that are guaranteed to cover the target value with the desired probability and that offer a trade‐off between computational cost and prediction accuracy. Copyright © 2018 John Wiley & Sons, Ltd. 