
Title: Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee
Kernel survival analysis models estimate individual survival distributions with the help of a kernel function, which measures the similarity between any two data points. Such a kernel function can be learned using deep kernel survival models. In this paper, we present a new deep kernel survival model called a survival kernet, which scales to large datasets in a manner that is amenable to model interpretation and also theoretical analysis. Specifically, the training data are partitioned into clusters based on a recently developed training set compression scheme for classification and regression called kernel netting that we extend to the survival analysis setting. At test time, each data point is represented as a weighted combination of these clusters, and each such cluster can be visualized. For a special case of survival kernets, we establish a finite-sample error bound on predicted survival distributions that is, up to a log factor, optimal. Whereas scalability at test time is achieved using the aforementioned kernel netting compression strategy, scalability during training is achieved by a warm-start procedure based on tree ensembles such as XGBoost and a heuristic approach to accelerating neural architecture search. On four standard survival analysis datasets of varying sizes (up to roughly 3 million data points), we show that survival kernets are highly competitive compared to various baselines tested in terms of time-dependent concordance index. Our code is available at: https://github.com/georgehc/survival-kernets  more » « less
Award ID(s):
2047981
PAR ID:
10514100
Author(s) / Creator(s):
Publisher / Repository:
Journal of Machine Learning Research
Date Published:
Journal Name:
Journal of Machine Learning Research
ISSN:
1533-7928
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep models trained through maximum likelihood have achieved state-of-the-art results for survival analysis. Despite this training scheme, practitioners evaluate models under other criteria, such as binary classification losses at a chosen set of time horizons, e.g., the Brier score (BS) and the Bernoulli log likelihood (BLL). Models trained with maximum likelihood may have poor BS or BLL, since maximum likelihood does not directly optimize these criteria. Directly optimizing criteria like BS requires inverse-weighting by the censoring distribution; however, estimating the censoring model under these metrics requires inverse-weighting by the failure distribution. The objective for each model requires the other, but neither is known. To resolve this dilemma, we introduce Inverse-Weighted Survival Games. In these games, the objective for each model is built from re-weighted estimates featuring the other model, which is held fixed during training. When the loss is proper, we show that the games always have the true failure and censoring distributions as a stationary point, meaning that models in the game do not leave the correct distributions once reached. We construct one case where this stationary point is unique. We show that these games optimize BS on simulations and then apply these principles to real-world cancer and critically ill patient data.
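As a reading aid, here is a minimal sketch of the standard inverse-probability-of-censoring-weighted (IPCW) Brier score that the abstract refers to. The dilemma is visible in the code: the weights depend on a censoring-survival estimate G, while fitting G under the analogous metric requires weights from the failure model. All names are hypothetical, and this is the textbook estimator rather than the paper's exact game objective.

```python
import numpy as np

def ipcw_brier(t, times, events, surv_pred, cens_surv, eps=1e-8):
    """IPCW Brier score at horizon t. `cens_surv` is a vectorized callable
    giving the censoring model's survival estimate G(u) = P(C > u).
    Names are hypothetical; a standard estimator, not the paper's exact
    game objective."""
    died = (times <= t) & (events == 1)    # failure observed by horizon t
    alive = times > t                      # known to survive past t
    # Observed failures weighted by 1/G(T_i); known survivors by 1/G(t);
    # subjects censored before t get weight zero. This dependence on the
    # censoring model G is what couples the two players in the game.
    w = np.where(died, 1.0 / np.maximum(cens_surv(times), eps), 0.0) \
        + alive / max(cens_surv(t), eps)
    # Squared error of predicted P(T > t) against the survival indicator.
    sq_err = np.where(died, surv_pred ** 2, (1.0 - surv_pred) ** 2)
    return float(np.mean(w * sq_err))
```

The censoring player's loss is the mirror image, with the roles of failure and censoring swapped and weights taken from the failure model; per the abstract, training alternates between the two models with the other held fixed.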
  2. Survival analysis aims to predict the time to an event of interest, along with its probability, from longitudinal data. It is commonly used to make predictions for a single event of interest at a given time point. However, many applications require predicting the occurrence of multiple events simultaneously and dynamically. An intuitive way to solve this problem is to apply a standard survival analysis method independently to each task at each time point. However, this often leads to a suboptimal solution, since the underlying dependencies between tasks are ignored, which motivates us to analyze these tasks jointly and select common features shared across all tasks. In this paper, we formulate a temporal Multi-Task learning framework (MTMT) using a tensor representation. More specifically, given a survival dataset and a sequence of monitored time points, we model each task at each time point as a standard survival analysis problem and optimize all tasks simultaneously. We demonstrate the performance of the MTMT model on two real-world datasets, where it outperforms several state-of-the-art models. We also provide the list of selected important features to demonstrate the interpretability of our model.
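The "analyze tasks jointly to select shared features" idea can be made concrete with a schematic objective: stack the per-task, per-time-point coefficients into a tensor and add a group-sparsity penalty over the feature axis, so that a feature is kept or dropped for all tasks and time points at once. This is an illustrative stand-in (the names and the squared loss are assumptions), not MTMT's actual formulation.

```python
import numpy as np

def joint_multitask_objective(W, X, Y, lam):
    """Schematic joint objective: W is a (features, tasks, time_points)
    coefficient tensor; an l2,1-style penalty over the feature axis
    encourages features shared across every task and time point.
    Illustrative only -- not MTMT's exact loss."""
    n = X.shape[0]
    preds = np.einsum('nf,fkt->nkt', X, W)    # one sub-problem per (task, time)
    data_fit = np.sum((preds - Y) ** 2) / n   # fit all sub-problems jointly
    # l2 norm of each feature's slice across all tasks/times, then summed:
    # zeroing an entire slice removes that feature from every sub-problem.
    group_norms = np.sqrt(np.sum(W ** 2, axis=(1, 2)))
    return data_fit + lam * np.sum(group_norms)
```

Solving each (task, time) sub-problem independently would correspond to dropping the penalty, which is exactly the baseline the abstract argues is suboptimal.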
  3. Summary: A general, rectangular kernel matrix may be defined as K(i, j) = κ(x_i, y_j), where κ is a kernel function and where X = {x_i} and Y = {y_j} are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix in the case where the point sets X and Y are large and arbitrarily distributed, for example away from each other, “intermingled”, or identical. Such rectangular kernel matrices may arise, for example, in Gaussian process regression, where X corresponds to the training data and Y corresponds to the test data; in this case, the points are often high-dimensional. Since the point sets are large, we must exploit the fact that the matrix arises from a kernel function and avoid forming the matrix explicitly, which rules out most algebraic techniques. In particular, we seek methods that can scale linearly, or nearly linearly, with the size of the data for a fixed approximation rank. The main idea in this paper is to geometrically select appropriate subsets of points to construct a low-rank approximation, and an analysis in the paper guides how this selection should be performed.
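One way to read the "geometrically select subsets" idea is a CUR/Nyström-style construction: pick landmark rows and columns by a geometric rule such as farthest-point sampling, then combine the three small blocks so the full matrix is never formed. The sketch below is an assumption-laden illustration of that family of methods, not the paper's specific algorithm, whose selection rule is guided by its own analysis.

```python
import numpy as np

def farthest_point_sample(pts, k):
    """Greedy geometric subset selection: repeatedly take the point
    farthest from the points already chosen."""
    idx = [0]
    d = np.linalg.norm(pts - pts[0], axis=1)
    for _ in range(k - 1):
        idx.append(int(np.argmax(d)))
        d = np.minimum(d, np.linalg.norm(pts - pts[idx[-1]], axis=1))
    return np.array(idx)

def lowrank_kernel(kernel, X, Y, k):
    """Rank-k approximation K ~ C @ pinv(U) @ R from geometrically chosen
    rows/columns; `kernel(A, B)` is assumed to return the pairwise kernel
    matrix. Only O((|X| + |Y|) * k) entries are ever evaluated."""
    i = farthest_point_sample(X, k)
    j = farthest_point_sample(Y, k)
    C = kernel(X, Y[j])      # (|X|, k) selected columns
    R = kernel(X[i], Y)      # (k, |Y|) selected rows
    U = kernel(X[i], Y[j])   # (k, k)   intersection block
    return C, np.linalg.pinv(U), R
```

Keeping the factors C, pinv(U), and R separate preserves the near-linear cost: matrix-vector products with the approximation never require assembling the full |X|-by-|Y| matrix.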
  4. Abstract. Aim: Adult survival is central to theories explaining latitudinal gradients in life history strategies. Life history theory predicts higher adult survival in tropical than north temperate regions given lower fecundity and parental effort. Early studies were consistent with this prediction, but standard-effort netting studies in recent decades suggested that apparent survival rates in temperate and tropical regions strongly overlap. Such results do not fit with life history theory. Targeted marking and resighting of breeding adults yielded higher survival estimates in the tropics, but this approach is thought to overestimate survival because it does not sample social and age classes with lower survival. We compared the effect of field methods on tropical survival estimates and their relationships with life history traits. Location: Sabah, Malaysian Borneo. Time period: 2008–2016. Major taxon: Passeriformes. Methods: We used standard-effort netting and resighted individuals of all social and age classes of 18 tropical songbird species over 8 years. We compared apparent survival estimates between these two field methods with differing analytical approaches. Results: Estimated detection and apparent survival probabilities from standard-effort netting were similar to those from other tropical studies that used standard-effort netting. Resighting data verified that a high proportion of individuals that were never recaptured in standard-effort netting remained in the study area, and many were observed breeding. Across all analytical approaches, the addition of resighting yielded substantially higher survival estimates than did standard-effort netting alone. These apparent survival estimates were higher than for temperate-zone species, consistent with latitudinal differences in life histories. Moreover, apparent survival estimates from the addition of resighting, but not from standard-effort netting alone, were correlated with parental effort as measured by egg temperature across species. Main conclusions: Inclusion of resighting showed that standard-effort netting alone can negatively bias apparent survival estimates and obscure life history relationships across latitudes and among tropical species.