We propose a screening method for high-dimensional data with ordinal competing risk outcomes, which is time-dependent and model-free. Existing methods are designed for cause-specific variable screening and fail to evaluate how a biomarker is associated with multiple competing events simultaneously. The proposed method utilizes the volume under the ROC surface (VUS), which measures the concordance between values of a biomarker and event status at certain time points and provides an overall evaluation of the discrimination capacity of a biomarker. We show that the VUS possesses the sure screening property, i.e., truly important covariates can be retained with probability tending to one, and the size of the selected set can be bounded with high probability. In simulation studies, the VUS appears to be a viable model-free screening metric compared with some existing methods, and it is especially robust to data contamination. Through an analysis of breast cancer gene expression data, we illustrate the unique insights into the overall discriminatory capability provided by the VUS.
Prognostic accuracy for predicting ordinal competing risk outcomes using ROC surfaces
Many medical conditions are marked by a sequence of events in association with continuous changes in biomarkers. Few works have evaluated the overall accuracy of a biomarker in predicting disease progression. We thus extend the concept of the receiver operating characteristic (ROC) surface and the volume under the surface (VUS) from multi-category outcomes to ordinal competing-risk outcomes that are also subject to noninformative censoring. Two VUS estimators are considered. One is based on the definition of the ROC surface and obtained by integrating the estimated ROC surface. The other is an inverse probability weighted U estimator that is built upon the equivalence of the VUS to the concordance probability between the marker and sequential outcomes. Both estimators have desirable asymptotic properties, which can be derived using counting process techniques and U-statistics theory. We illustrate their good practical performance through simulations and applications to two studies of cognition and a transplant dataset.
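The equivalence between the VUS and a concordance probability can be illustrated with a minimal sketch for the simplest case: three ordered outcome groups, fully observed, at a single time point. This is not the inverse probability weighted estimator described above, since it ignores censoring and time dependence; the function name `empirical_vus` and the toy data are assumptions made for illustration. It estimates the probability that one marker value drawn from each group falls in the correct order, for which an uninformative marker gives about 1/6.

```python
from itertools import product


def empirical_vus(group1, group2, group3):
    """Unweighted concordance estimate of the VUS for three ordered groups.

    Returns the fraction of triples (x1, x2, x3), one marker value per
    group, that satisfy x1 < x2 < x3. Ties are counted as discordant.
    Assumes complete data: no censoring weights are applied.
    """
    total = len(group1) * len(group2) * len(group3)
    if total == 0:
        raise ValueError("each group needs at least one observation")
    concordant = sum(
        1 for x1, x2, x3 in product(group1, group2, group3) if x1 < x2 < x3
    )
    return concordant / total


# Hypothetical marker values for three ordered outcome groups.
perfect = empirical_vus([1, 2], [3, 4], [5, 6])   # fully separated groups
partial = empirical_vus([1, 4], [2, 5], [3, 6])   # overlapping groups
```

The triple loop costs the product of the three group sizes, which is acceptable for a sketch but would need a faster counting scheme for large samples.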
- Award ID(s): 1916001
- PAR ID: 10556492
- Publisher / Repository: Springer
- Date Published:
- Journal Name: Lifetime Data Analysis
- Volume: 28
- Issue: 1
- ISSN: 1380-7870
- Page Range / eLocation ID: 1 to 22
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The optimal receiver operating characteristic (ROC) curve, giving the maximum probability of detection as a function of the probability of false alarm, is a key information-theoretic indicator of the difficulty of a binary hypothesis testing (BHT) problem. It is well known that the optimal ROC curve for a given BHT, corresponding to the likelihood ratio test, is theoretically determined by the probability distribution of the observed data under each of the two hypotheses. In some cases, these two distributions may be unknown or computationally intractable, but independent samples of the likelihood ratio can be observed. This raises the problem of estimating the optimal ROC for a BHT from such samples. The maximum likelihood estimator of the optimal ROC curve is derived, and it is shown to converge to the true optimal ROC curve in the Lévy metric as the number of observations tends to infinity. A classical empirical estimator, based on estimating the two types of error probabilities from two separate sets of samples, is also considered. The maximum likelihood estimator is observed in simulation experiments to be considerably more accurate than the empirical estimator, especially when the number of samples obtained under one of the two hypotheses is small. The area under the maximum likelihood estimator is derived; it is a consistent estimator of the true area under the optimal ROC curve.
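As a rough illustration of the classical empirical estimator mentioned above (not the maximum likelihood estimator, which the abstract does not specify in enough detail to reproduce), the following sketch builds an empirical ROC curve from two separate sets of scores and integrates it with the trapezoid rule. The function names and sample values are assumptions; the scores stand in for observed likelihood ratios, with larger values favoring the alternative hypothesis.

```python
def empirical_roc(null_scores, alt_scores):
    """Empirical ROC points (P_false_alarm, P_detection), from (0, 0) to (1, 1).

    Each distinct score is used as a threshold t, and a sample is declared
    a detection when its score is >= t.
    """
    thresholds = sorted(set(null_scores) | set(alt_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        p_fa = sum(s >= t for s in null_scores) / len(null_scores)
        p_d = sum(s >= t for s in alt_scores) / len(alt_scores)
        points.append((p_fa, p_d))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores observed under each hypothesis.
roc = empirical_roc(null_scores=[1, 3], alt_scores=[2, 4])
auc = trapezoid_area(roc)
```

With few samples under one hypothesis, the corresponding error probability moves in coarse steps, which is consistent with the abstract's remark that the empirical estimator is less accurate in that setting.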
-
This paper studies the Cox model with time-varying coefficients for cause-specific hazard functions when the causes of failure are subject to missingness. Inverse probability weighted and augmented inverse probability weighted estimators are investigated. The latter is considered as a two-stage estimator by directly utilizing the inverse probability weighted estimator and through modeling available auxiliary variables to improve efficiency. The asymptotic properties of the two estimators are investigated. Hypothesis testing procedures are developed to test the null hypotheses that the covariate effects are zero and that the covariate effects are constant. We conduct simulation studies to examine the finite sample properties of the proposed estimation and hypothesis testing procedures under various settings of the auxiliary variables and the percentages of the failure causes that are missing. These simulation results demonstrate that the augmented inverse probability weighted estimators are more efficient than the inverse probability weighted estimators and that the proposed testing procedures have the expected satisfactory results in sizes and powers. The proposed methods are illustrated using the Mashi clinical trial data for investigating the effect of randomization to formula-feeding versus breastfeeding plus extended infant zidovudine prophylaxis on death due to mother-to-child HIV transmission in Botswana.
-
Background: Variants within factor VIII (F8) are associated with sex-linked hemophilia A and thrombosis, with gene therapy approaches being available for pathogenic variants. Many variants within F8 remain variants of uncertain significance (VUS) or are under-explored as to their connections to phenotypic outcomes. Methods: We assessed data on F8 expression while screening the UniProt, ClinVar, Geno2MP, and gnomAD databases for F8 missense variants; these collectively represent the sequencing of more than a million individuals. Results: For the two F8 isoforms coding for different protein lengths (2351 and 216 amino acids), we observed noncoding variants influencing expression which are also associated with thrombosis risk, with uncertainty as to differences in females and males. Variant analysis identified a severe stratification of potential annotation issues for missense variants in subjects of non-European ancestry, suggesting a need for further defining the genetics of diverse populations. Additionally, few heterozygous female carriers of known pathogenic variants have sufficiently confident phenotyping data, leaving researchers unable to determine subtle, less defined phenotypes. Using structure movement correlations to known pathogenic variants for the VUS, we determined seven clusters of likely pathogenic variants based on screening work. Conclusions: This work highlights the need to define missense variants, especially those for VUS and from subjects of non-European ancestry, as well as the roles of these variants in women’s physiology.
-
In this study, we investigate and develop scaling laws as a function of external non-dimensional control parameters for heat and momentum transport for non-rotating, slowly rotating and rapidly rotating turbulent convection systems, with the end goal of forging connections and bridging the various gaps between these regimes. Two perspectives are considered, one where turbulent convection is viewed from the standpoint of an applied temperature drop across the domain and the other with a viewpoint in terms of an applied heat flux. While a straightforward transformation exists between the two perspectives indicating equivalence, it is found that the former provides a clear set of connections that bridge between the three regimes. Our generic convection scalings, based upon an Inertial-Archimedean balance, produce the classic diffusion-free scalings for the non-rotating limit (NRL) and the slowly rotating limit (SRL). This is characterized by a free-falling fluid parcel on the global scale possessing a thermal anomaly on par with the temperature drop across the domain. In the rapidly rotating limit (RRL), the generic convection scalings are based on a Coriolis-Inertial-Archimedean (CIA) balance, along with a local fluctuating-mean advective temperature balance. This produces a scenario in which anisotropic fluid parcels attain a thermal wind velocity and where the thermal anomalies are greatly attenuated compared to the total temperature drop. We find that turbulent scalings may be deduced simply by consideration of the generic non-dimensional transport parameters --- local Reynolds $$Re_\ell = U \ell /\nu$$; local Péclet $$Pe_\ell = U \ell /\kappa$$; and Nusselt number $$Nu = U \vartheta/(\kappa \Delta T/H)$$ --- through the selection of physically relevant estimates for length $$\ell$$, velocity $$U$$ and temperature scales $$\vartheta$$ in each regime.
Emergent from the scaling analyses is a unified continuum based on a single external control parameter, the convective Rossby number, $$Ro_C = \sqrt{g \alpha \Delta T / (4 \Omega^2 H)}$$, that strikingly appears in each regime by consideration of the local, convection-scale Rossby number $$Ro_\ell = U/(2\Omega \ell)$$. Thus we show that $$Ro_C$$ scales with the local Rossby number $$Ro_\ell$$ in both the slowly rotating and the rapidly rotating regimes, explaining the ubiquity of $$Ro_C$$ in rotating convection studies. We show in non-, slowly, and rapidly rotating systems that the convective heat transport, parameterized via $$Pe_\ell$$, scales with the total heat transport parameterized via the Nusselt number $$Nu$$. Within the rapidly rotating limit, momentum transport arguments generate a scaling for the system-scale Rossby number, $$Ro_H$$, that, recast in terms of the total heat flux through the system, is shown to be synonymous with the classical flux-based `CIA' scaling, $$Ro_{CIA}$$. These, in turn, are then shown to asymptote to $$Ro_H \sim Ro_{CIA} \sim Ro_C^2$$, demonstrating that these momentum transport scalings are identical in the limit of rapidly rotating turbulent heat transfer.
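The two Rossby numbers defined above are simple algebraic combinations of the control parameters, so a short sketch can make the definitions concrete. The function and argument names are assumptions chosen for illustration, and the inputs are expected in any consistent set of units.

```python
import math


def convective_rossby(g, alpha, delta_t, omega, height):
    """Convective Rossby number: sqrt(g * alpha * delta_t / (4 * omega**2 * height)).

    g: gravitational acceleration, alpha: thermal expansion coefficient,
    delta_t: temperature drop across the layer, omega: rotation rate,
    height: layer depth H.
    """
    return math.sqrt(g * alpha * delta_t / (4.0 * omega**2 * height))


def local_rossby(velocity, omega, length):
    """Local, convection-scale Rossby number: U / (2 * omega * length)."""
    return velocity / (2.0 * omega * length)
```

Because the rotation rate enters the convective Rossby number as an inverse first power after the square root, doubling `omega` halves the result when the other inputs are held fixed.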

