Title: Assessment of Conformal Prediction and Standard Normal Distribution for Autonomous Consensus One‐Class Classification
ABSTRACT Determining if target samples are members of a particular source class of samples has a large variety of applications within many disciplines. In particular, one‐class classification (OCC) is essential in many areas, such as food contamination or product authentication. There are numerous widely accepted methods for OCC, but these OCC methods involve optimizing tuning parameters such as the number of principal components (PCs). This study presents the development and application of a rigorous autonomous OCC process based on a hybrid fusion consensus technique, termed consensus OCC (Con OCC). The Con OCC method uses the new physicochemical responsive integrated similarity measure (PRISM) composed of multiple similarity measures all independent of optimization. Similarity values are fused to a single value describing the degree of sample similarity to a collection of samples. Two approaches are developed to translate each sample‐wise PRISM value to a probability of class membership: conformal prediction p‐values and z‐scores. These two methods are evaluated as separate Con OCC processes using seven datasets measured across a variety of instruments. In both cases, class membership labels are not used to set decision thresholds, and classifiers are not optimized relative to respective tuning parameters. Results indicate that z‐scoring often produces better results, but conformal prediction provides greater consistency across datasets. That is, z‐score values tend to range across datasets while conformal prediction p‐values do not.
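The two translations named in the abstract can be illustrated with a minimal sketch. It is a generic textbook version, not the authors' Con OCC or PRISM implementation. It assumes each sample has already been reduced to a single dissimilarity score (larger means less like the source class), and the function names and example scores are hypothetical.

```python
import math
import statistics


def conformal_p_value(source_scores, target_score):
    """Conformal p-value: share of source-class scores at least as large
    as the target score, with the usual +1 correction (assumed convention)."""
    n_at_least = sum(1 for s in source_scores if s >= target_score)
    return (n_at_least + 1) / (len(source_scores) + 1)


def z_score(source_scores, target_score):
    """Standardize the target score against the source-class scores."""
    mean = statistics.fmean(source_scores)
    std = statistics.stdev(source_scores)  # sample standard deviation
    return (target_score - mean) / std


def normal_upper_tail(z):
    """Upper-tail probability of a standard normal distribution at z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical dissimilarity scores for ten source-class samples.
source = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0, 0.95, 1.05, 0.85, 1.15]

for target in (1.0, 3.0):
    p = conformal_p_value(source, target)
    z = z_score(source, target)
    print(f"score={target}: conformal p={p:.3f}, z={z:.2f}, "
          f"normal tail={normal_upper_tail(z):.3g}")
```

In this sketch the conformal p-value is bounded between 1/(n+1) and 1 whatever the scale of the scores, while the z-score is unbounded. That is consistent with the abstract's remark that z-score values range across datasets while conformal p-values do not.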
Award ID(s):
2305020
PAR ID:
10559675
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Journal of Chemometrics
Volume:
39
Issue:
1
ISSN:
0886-9383
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT Conformal predictions transform a measurable, heuristic notion of uncertainty into statistically valid confidence intervals such that, for a future sample, the true class prediction will be included in the conformal prediction set at a predetermined confidence. In a Bayesian perspective, common estimates of uncertainty in multivariate classification, namely p‐values, only provide the probability that the data fits the presumed class model, P(D|M). Conformal predictions, on the other hand, address the more meaningful probability that a model fits the data, P(M|D). Herein, two methods to perform inductive conformal predictions are investigated: the traditional Split Conformal Prediction that uses an external calibration set and a novel Bagged Conformal Prediction, closely related to Cross Conformal Predictions, that utilizes bagging to calibrate the heuristic notions of uncertainty. Methods for preprocessing the conformal prediction scores to improve performance are discussed and investigated. These conformal prediction strategies are applied to identifying four non‐steroidal anti‐inflammatory drugs (NSAIDs) from hyperspectral Raman imaging data. In addition to assigning meaningful confidence intervals on the model results, we herein demonstrate how conformal predictions can add additional diagnostics for model quality and method stability.
  2. Abstract We explore a large class of correlation measures called the α−z Rényi mutual informations (RMIs). Unlike the commonly used notion of RMI involving linear combinations of Rényi entropies, the α−z RMIs are positive semi-definite and monotonically decreasing under local quantum operations, making them sensible measures of total (quantum and classical) correlations. This follows from their descendance from Rényi relative entropies. In addition to upper bounding connected correlation functions between subsystems, we prove the much stronger statement that for certain values of α and z, the α−z RMIs also lower bound certain connected correlation functions. We develop an easily implementable replica trick which enables us to compute the α−z RMIs in a variety of many-body systems including conformal field theories, free fermions, random tensor networks, and holography.
  3. We develop fast distribution-free conformal prediction algorithms for obtaining multivalid coverage on exchangeable data in the batch setting. Multivalid coverage guarantees are stronger than marginal coverage guarantees in two ways: (1) They hold even conditional on group membership; that is, the target coverage level holds conditionally on membership in each of an arbitrary (potentially intersecting) group in a finite collection of regions in the feature space. (2) They hold even conditional on the value of the threshold used to produce the prediction set on a given example. In fact, multivalid coverage guarantees hold even when conditioning on group membership and threshold value simultaneously. We give two algorithms: both take as input an arbitrary non-conformity score and an arbitrary collection of possibly intersecting groups, and then can equip arbitrary black-box predictors with prediction sets. Our first algorithm is a direct extension of quantile regression, needs to solve only a single convex minimization problem, and produces an estimator which has group-conditional guarantees for each group in the collection. Our second algorithm is iterative, and gives the full guarantees of multivalid conformal prediction: prediction sets that are valid conditionally both on group membership and non-conformity threshold. We evaluate the performance of both of our algorithms in an extensive set of experiments.
  4. While developing continuous authentication systems (CAS), we generally assume that samples from both genuine and impostor classes are readily available. However, the assumption may not be true in certain circumstances. Therefore, we explore the possibility of implementing CAS using only genuine samples. Specifically, we investigate the usefulness of four one-class classifiers (OCC), namely elliptic envelope, isolation forest, local outlier factor, and one-class support vector machines, and their fusion. The performance of these classifiers was evaluated on four distinct behavioral biometric datasets, and compared with eight multi-class classifiers (MCC). The results demonstrate that if we have sufficient training data from the genuine user, the OCC and their fusion can closely match the performance of the majority of MCC. Our findings encourage the research community to use OCC in order to build CAS, as it does not require knowledge of the impostor class during the enrollment process.
  5. We propose a model-free framework for sensitivity analysis of individual treatment effects (ITEs), building upon ideas from conformal inference. For any unit, our procedure reports the Γ-value, a number which quantifies the minimum strength of confounding needed to explain away the evidence for ITE. Our approach rests on the reliable predictive inference of counterfactuals and ITEs in situations where the training data are confounded. Under the marginal sensitivity model of [Z. Tan, J. Am. Stat. Assoc. 101, 1619-1637 (2006)], we characterize the shift between the distribution of the observations and that of the counterfactuals. We first develop a general method for predictive inference of test samples from a shifted distribution; we then leverage this to construct covariate-dependent prediction sets for counterfactuals. No matter the value of the shift, these prediction sets (resp. approximately) achieve marginal coverage if the propensity score is known exactly (resp. estimated). We describe a distinct procedure also attaining coverage, however, conditional on the training data. In the latter case, we prove a sharpness result showing that for certain classes of prediction problems, the prediction intervals cannot possibly be tightened. We verify the validity and performance of the methods via simulation studies and apply them to analyze real datasets. 
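Item 1 in the list above refers to Split Conformal Prediction with an external calibration set. The following is a minimal generic sketch of that idea for classification, not the method of that paper. It assumes a classifier that already outputs class probabilities, uses one minus the probability of the true class as the nonconformity score, and all names and numbers are hypothetical.

```python
import math


def split_conformal_threshold(cal_probs, cal_labels, alpha):
    """Calibrate a nonconformity threshold on a held-out calibration set.

    The score is 1 - probability assigned to the true class (an assumed,
    commonly used choice). Returns the ceil((n + 1) * (1 - alpha))-th
    smallest score, or infinity when that rank exceeds n.
    """
    scores = sorted(1.0 - probs[label]
                    for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """All class indices whose nonconformity score is within the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


# Hypothetical calibration outputs for a two-class problem.
cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05], [0.3, 0.7],
             [0.6, 0.4], [0.15, 0.85], [0.4, 0.6]]
cal_labels = [0, 1, 0, 1, 0, 1, 0]

threshold = split_conformal_threshold(cal_probs, cal_labels, alpha=0.25)
print(prediction_set([0.7, 0.3], threshold))
```

Under exchangeability of calibration and future samples, sets built this way contain the true class with probability at least 1 - alpha. A very small alpha relative to the calibration set size yields an infinite threshold, so every class is included.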