This content will become publicly available on June 23, 2026

Title: Conformal survival bands for risk screening under right-censoring
We propose a method to quantify uncertainty around individual survival distribution estimates using right-censored data, compatible with any survival model. Unlike classical confidence intervals, the survival bands produced by this method offer predictive rather than population-level inference, making them useful for personalized risk screening. For example, in a low-risk screening scenario, they can be applied to flag patients whose survival band at 12 months lies entirely above 50%, while ensuring that at least half of flagged individuals will survive past that time on average. Our approach builds on recent advances in conformal inference and integrates ideas from inverse probability of censoring weighting and multiple testing with false discovery rate control. We provide asymptotic guarantees and show promising performance in finite samples with both simulated and real data.
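As a rough illustration of the screening rule described in the abstract, the sketch below flags patients whose precomputed lower band value at 12 months exceeds 50%; the band construction itself (conformal calibration with IPCW weights and FDR control) is not reproduced, and all names here are hypothetical.

```python
# Minimal sketch of the flagging rule (hypothetical interface): a
# patient is flagged low-risk when the *lower* edge of their survival
# band at the evaluation time already exceeds the threshold.
import numpy as np

def flag_low_risk(lower_band_at_t, threshold=0.5):
    """Flag patients whose survival band at a fixed time lies
    entirely above `threshold`."""
    return np.asarray(lower_band_at_t) > threshold

# Toy usage: lower edges of 12-month survival bands for five patients.
print(flag_low_risk([0.62, 0.48, 0.71, 0.55, 0.33]))
# -> [ True False  True  True False]
```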
Award ID(s):
2210637
PAR ID:
10625824
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of Machine Learning Research
Date Published:
Format(s):
Medium: X
Location:
14th Symposium on Conformal and Probabilistic Prediction with Applications
Sponsoring Org:
National Science Foundation
More Like this
  1. Big data is ubiquitous in various fields of science, engineering, medicine, the social sciences, and the humanities. It is often accompanied by a large number of variables and features. While an enriched feature space adds much greater flexibility to modeling, ultra-high dimensional data analysis poses fundamental challenges to scalable learning and inference with good statistical efficiency. Sure independence screening is a simple and effective method for this task. This two-scale statistical learning framework, introduced in Fan and Lv (2008) and consisting of large-scale screening followed by moderate-scale variable selection, has been extensively investigated and extended to various model settings ranging from parametric to semiparametric and nonparametric for regression, classification, and survival analysis. This article provides an overview of the developments of sure independence screening over the past decade. These developments demonstrate the wide applicability of learning and inference based on sure independence screening for big data analysis, with the desired scalability and theoretical guarantees.
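The two-scale recipe lends itself to a compact sketch. Below is a minimal, illustrative version assuming a continuous response, marginal Pearson correlation as the screening statistic, and the lasso as the moderate-scale selector; the function names and the choice of d = n/log(n) follow common convention rather than any single paper.

```python
# Illustrative two-scale learning: sure independence screening (SIS)
# by marginal correlation, followed by lasso on the survivors.
import numpy as np
from sklearn.linear_model import LassoCV

def sis_then_lasso(X, y, d=None):
    n, p = X.shape
    d = d or int(n / np.log(n))        # conventional screening size
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Absolute marginal correlation of each feature with the response.
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    kept = np.argsort(corr)[-d:]       # large-scale screening stage
    lasso = LassoCV(cv=5).fit(X[:, kept], y)   # moderate-scale selection
    return kept[lasso.coef_ != 0]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5000))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)
print(sorted(sis_then_lasso(X, y)))    # typically contains features 0 and 1
```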
  2.
    This chapter provides a selective review of feature screening methods for ultra-high dimensional data. The main idea of feature screening is to reduce the ultra-high dimensionality of the feature space to a moderate size in a fast and efficient way while retaining all the important features in the reduced space; this is referred to as the sure screening property. After feature screening, more sophisticated methods can be applied to the reduced feature space for further analysis, such as parameter estimation and statistical inference. This chapter focuses only on the feature screening stage. From the perspective of data type, we review feature screening methods for independent and identically distributed data, longitudinal data, and survival data. From the perspective of modeling, we review various models, including the linear model, generalized linear model, additive model, varying-coefficient model, Cox model, etc. We also cover some model-free feature screening procedures, one of which is sketched below.
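One widely cited model-free statistic is distance correlation, used for screening in the spirit of the DC-SIS procedure of Li, Zhong, and Zhu (2012); the implementation details below are illustrative.

```python
# Illustrative model-free screening: rank features by their empirical
# distance correlation with the response and keep the top d.
import numpy as np

def dcorr(x, y):
    """Empirical distance correlation between two 1-D samples."""
    def double_centered(z):
        d = np.abs(z[:, None] - z[None, :])
        return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()
    A = double_centered(np.asarray(x, float))
    B = double_centered(np.asarray(y, float))
    dcov2 = (A * B).mean()                       # squared distance covariance
    dvar2 = (A * A).mean() * (B * B).mean()      # product of squared variances
    return np.sqrt(max(dcov2, 0.0)) / dvar2 ** 0.25 if dvar2 > 0 else 0.0

def dc_screen(X, y, d):
    stats = np.array([dcorr(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(stats)[-d:]                # indices of retained features
```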
  3.
    Cyber insurance, like other types of insurance, is a method of risk transfer in which the insured pays a premium in exchange for coverage in the event of a loss. Because the insured's risk is reduced and the insurer lacks information about the insured's behavior, the insured is generally inclined to lower its effort, leading to a worse state of security, a common phenomenon known as moral hazard. To mitigate moral hazard, a widely employed concept is premium discrimination: an agent/insured who exerts higher effort pays a lower premium. This, however, relies on the insurer's ability to assess the effort exerted by the insured. In this paper, we study two methods of premium discrimination that rely on two different types of assessment: pre-screening and post-screening. Pre-screening occurs before the insured enters into a contract and can be done at the beginning of each contract period; this process gives the insurer an estimated risk for the insured, which then determines the contract terms. The post-screening mechanism involves at least two contract periods, whereby the second-period premium is increased if a loss event occurs during the first period. Prior work shows that both pre-screening and post-screening are generally effective in mitigating moral hazard and increasing the insured's effort. The analysis in this study shows, however, that the conclusion becomes more nuanced when loss events are rare. Specifically, we show that post-screening is not effective at all with rare losses, while pre-screening can be an effective method when the agent perceives losses as rarer than the insurer does; in this case pre-screening improves both the agent's effort level and the insurer's profit.
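A toy calculation makes the rare-loss point concrete. The functional forms below are illustrative assumptions, not the paper's model: effort e reduces the loss probability to p(1 - e) at linear cost ce, and post-screening adds an expected second-period penalty of p(1 - e) times a fixed surcharge, so effort pays off only when p times the surcharge exceeds c.

```python
# Toy sketch (illustrative assumptions, not the paper's model): the
# agent's expected cost is c*e + p*(1 - e)*penalty, which is linear
# in effort e, so the optimum is e = 1 when p*penalty > c, else e = 0.
# As losses become rare (p -> 0), post-screening loses its bite.
c, penalty = 1.0, 5.0
for p in [0.5, 0.2, 0.1, 0.01]:
    effort = 1.0 if p * penalty > c else 0.0
    print(f"p={p:.2f}: marginal benefit {p * penalty:.2f} vs cost {c:.2f} "
          f"-> optimal effort {effort}")
```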
  4. Yanwu, Xu (Ed.)
    Lung cancer is a major cause of cancer-related deaths, and early diagnosis and treatment are crucial for improving patients' survival outcomes. In this paper, we propose to employ convolutional neural networks to model the non-linear relationship between the risk of lung cancer and the lungs' morphology revealed in CT images. We apply a mini-batched loss that extends the Cox proportional hazards model to handle the non-convexity induced by neural networks and enables training on large data sets. Additionally, we propose to combine the mini-batched loss with binary cross-entropy to predict both lung cancer occurrence and the risk of mortality. Simulation results demonstrate the effectiveness of the mini-batched loss both with and without the censoring mechanism, as well as in combination with binary cross-entropy. We evaluate our approach on the National Lung Screening Trial data set with several 3D convolutional neural network architectures, achieving high AUC and C-index scores for lung cancer classification and survival prediction. These results, obtained from simulations and real-data experiments, highlight the potential of our approach to improve the diagnosis and treatment of lung cancer.
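A standard way to implement such a loss is the negative Cox partial log-likelihood evaluated within each mini-batch; the PyTorch sketch below shows that idea together with a weighted binary cross-entropy term. It is a generic illustration, not the paper's exact formulation or architecture.

```python
# Mini-batched Cox partial-likelihood loss plus binary cross-entropy
# (generic sketch; weights and interfaces here are illustrative).
import torch

def cox_ph_loss(risk, time, event):
    """Negative Cox partial log-likelihood within a mini-batch.
    risk:  (n,) predicted log-risk scores from the network
    time:  (n,) observed times
    event: (n,) 1.0 if the event occurred, 0.0 if right-censored
    """
    order = torch.argsort(time, descending=True)   # build risk sets by time
    risk, event = risk[order], event[order]
    log_risk_set = torch.logcumsumexp(risk, dim=0) # log-sum-exp over risk set
    n_events = event.sum().clamp(min=1.0)          # guard all-censored batches
    return -((risk - log_risk_set) * event).sum() / n_events

def combined_loss(risk, logits, time, event, label, alpha=0.5):
    """Weighted combination of survival and occurrence objectives."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits(logits, label)
    return alpha * cox_ph_loss(risk, time, event) + (1.0 - alpha) * bce
```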
  5. We propose a time-dependent, model-free screening method for high-dimensional data with ordinal competing-risk outcomes. Existing methods are designed for cause-specific variable screening and fail to evaluate how a biomarker is associated with multiple competing events simultaneously. The proposed method utilizes the volume under the ROC surface (VUS), which measures the concordance between the values of a biomarker and event status at certain time points and provides an overall evaluation of a biomarker's discrimination capacity. We show that the VUS possesses the sure screening property, i.e., truly important covariates are retained with probability tending to one, and that the size of the selected set can be bounded with high probability. In simulation studies, the VUS appears to be a viable model-free screening metric compared with some existing methods, and it is especially robust to data contamination. Through an analysis of breast-cancer gene-expression data, we illustrate the unique insights into overall discriminatory capability provided by the VUS.
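The VUS for a three-level ordinal status has a simple empirical form: the proportion of triples, one subject from each level, whose marker values are correctly ordered. The sketch below computes that uncensored version at a fixed time point; the paper's time-dependent, censoring-adjusted estimator is not reproduced here.

```python
# Illustrative empirical VUS for a biomarker against a three-level
# ordinal status (0 < 1 < 2) at a fixed time; censoring adjustments
# used in practice are omitted.
import numpy as np

def empirical_vus(marker, status):
    marker, status = np.asarray(marker, float), np.asarray(status)
    m0, m1, m2 = (marker[status == k] for k in (0, 1, 2))
    lower = (m0[:, None] < m1[None, :]).sum(axis=0)  # m0 values below each m1
    upper = (m1[:, None] < m2[None, :]).sum(axis=1)  # m2 values above each m1
    # Fraction of concordant triples m0 < m1 < m2.
    return (lower * upper).sum() / (len(m0) * len(m1) * len(m2))

def vus_screen(X, status, d):
    """Keep the d biomarkers with the largest VUS."""
    vus = np.array([empirical_vus(X[:, j], status) for j in range(X.shape[1])])
    return np.argsort(vus)[-d:]
```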