Abstract In this study, we focus on estimating the heterogeneous treatment effect (HTE) for survival outcomes. The outcome is subject to censoring and the covariates are high-dimensional. We utilize data from both a randomized controlled trial (RCT), considered the gold standard, and real-world data (RWD), possibly affected by hidden confounding factors. To achieve a more efficient HTE estimate, such integrative analysis requires insight into the data generation mechanism, particularly an accurate characterization of unmeasured confounding effects/bias. With this aim, we propose a penalized-regression-based integrative approach that allows for the simultaneous estimation of parameters, selection of variables, and identification of the existence of unmeasured confounding effects. The consistency, asymptotic normality, and efficiency gains are rigorously established for the proposed estimate. Finally, we apply the proposed method to estimate the HTE of lobar/sublobar resection on the survival of lung cancer patients. The RCT is a multicenter non-inferiority randomized phase 3 trial, and the RWD comes from a clinical oncology cancer registry in the United States. The analysis reveals that unmeasured confounding exists and that the integrative approach improves the efficiency of HTE estimation.
Assessment of heterogeneous treatment effect estimation accuracy via matching
We study the assessment of the accuracy of heterogeneous treatment effect (HTE) estimation, where the HTE is not directly observable so standard computation of prediction errors is not applicable. To tackle the difficulty, we propose an assessment approach by constructing pseudo‐observations of the HTE based on matching. Our contributions are three‐fold: first, we introduce a novel matching distance derived from proximity scores in random forests; second, we formulate the matching problem as an average minimum‐cost flow problem and provide an efficient algorithm; third, we propose a match‐then‐split principle for the assessment with cross‐validation. We demonstrate the efficacy of the assessment approach using simulations and a real dataset.
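The idea of building pseudo-observations of the HTE by matching can be illustrated with a minimal Python sketch. This is not the authors' method: it swaps in plain Euclidean nearest-neighbor matching (with replacement) for the random-forest proximity distance and the minimum-cost flow formulation described above, and it omits the match-then-split cross-validation step. The function names `matched_pseudo_effects` and `hte_assessment_error` are hypothetical.

```python
import numpy as np


def matched_pseudo_effects(X, treated, y):
    """Pair each treated unit with its nearest control unit.

    Assumption: Euclidean distance on covariates, matching with replacement.
    Returns the treated indices, the matched control indices, and the
    pseudo-observations y_treated - y_matched_control.
    """
    X = np.asarray(X, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    y = np.asarray(y, dtype=float)

    t_idx = np.flatnonzero(treated)
    c_idx = np.flatnonzero(~treated)

    # Pairwise squared distances between treated and control covariates.
    diff = X[t_idx][:, None, :] - X[c_idx][None, :, :]
    d2 = (diff ** 2).sum(axis=2)

    # Closest control unit for each treated unit.
    nearest = c_idx[d2.argmin(axis=1)]
    pseudo = y[t_idx] - y[nearest]
    return t_idx, nearest, pseudo


def hte_assessment_error(tau_hat, pseudo):
    """Mean squared difference between HTE estimates and pseudo-observations."""
    return float(np.mean((np.asarray(tau_hat, dtype=float) - pseudo) ** 2))


# Toy data: two treated units (0 and 2), two controls (1 and 3).
X = [[0.0], [0.1], [1.0], [1.1]]
treated = [1, 0, 1, 0]
y = [3.0, 1.0, 5.0, 2.0]

t_idx, nearest, pseudo = matched_pseudo_effects(X, treated, y)
error = hte_assessment_error([1.0, 3.0], pseudo)
```

Here `error` scores a candidate HTE estimate at the treated units against the matched pseudo-observations, which stands in for the prediction error that cannot be computed directly.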
- PAR ID: 10449431
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name: Statistics in Medicine
- Volume: 40
- Issue: 17
- ISSN: 0277-6715
- Format(s): Medium: X
- Size(s): p. 3990-4013
- Sponsoring Org: National Science Foundation
More Like this
Abstract The field of sustainability science has grown significantly over the past two decades in terms of both conceptual development and empirical research. Systems-focused analysis is critical to building generalizable knowledge in the field, yet much relevant research does not take a systems view. Systems-oriented analytical frameworks can help researchers conceptualize and analyze sustainability-relevant systems, but existing frameworks may lack access or utility outside a particular research tradition. In this article, we outline the human–technical–environmental (HTE) framework, which provides analysts from different disciplinary backgrounds and fields of study a common way to advance systems-focused research on sustainability issues. We detail a step-by-step guide for the application of the HTE framework through a matrix-based approach for identifying system components, studying interactions among system components, and examining interventions targeting components and/or their interactions for the purpose of advancing sustainability. We demonstrate the applicability of the HTE framework and the matrix-based approach through an analysis of an empirical case of coal-fired power plants and mercury pollution, which is relevant to large-scale sustainability transitions. Based on this analysis, we identify specific insights related to the applicability of upstream and downstream leverage points, connections between energy markets and the use of pollution control technologies, and the importance of institutions fitting both biophysical dynamics and socioeconomic and political dynamics. Further application of the HTE framework and the identification of insights can help develop systems-oriented analysis, and inform societal efforts to advance sustainability, as well as contribute to the formulation of empirically grounded middle-range theories related to sustainability systems and sustainability transitions. 
We conclude with a discussion of areas for further development and application of the HTE framework.
We propose a unified data-driven framework based on inverse optimal transport that can learn an adaptive, nonlinear interaction cost function from a noisy and incomplete empirical matching matrix and predict new matchings in various matching contexts. We emphasize that discrete optimal transport plays the role of a variational principle, which gives rise to an optimization-based framework for modeling the observed empirical matching data. Our formulation leads to a non-convex optimization problem that can be solved efficiently by an alternating optimization method. A key novel aspect of our formulation is the incorporation of marginal relaxation via a regularized Wasserstein distance, significantly improving the robustness of the method in the face of noisy or missing empirical matching data. Our model falls into the category of prescriptive models, which not only predict potential future matchings but can also explain what leads to the empirical matching and quantify the impact of changes in matching factors. The proposed approach has wide applicability, including predicting matching in online dating, labor markets, college applications, and crowdsourcing. We back up our claims with numerical experiments on both synthetic and real-world data sets.
Abstract In this paper, we address the problem of estimating transport surplus (a.k.a. matching affinity) in high-dimensional optimal transport problems. Classical optimal transport theory specifies the matching affinity and determines the optimal joint distribution. In contrast, we study the inverse problem of estimating matching affinity based on the observation of the joint distribution, using an entropic regularization of the problem. To accommodate high dimensionality of the data, we propose a novel method that incorporates a nuclear norm regularization that effectively enforces a rank constraint on the affinity matrix. The low-rank matrix estimated in this way reveals the main factors that are relevant for matching.
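The classical (forward) direction mentioned above, where the affinity is given and the joint distribution is computed, can be sketched with entropic regularization using Sinkhorn iterations. This is only an illustration of the forward problem under assumed inputs; it is not the inverse estimator of the paper and includes no nuclear norm penalty. The function name `sinkhorn_coupling` and the parameters `eps` and `n_iter` are hypothetical.

```python
import numpy as np


def sinkhorn_coupling(affinity, p, q, eps=1.0, n_iter=500):
    """Entropy-regularized optimal coupling for a given affinity matrix.

    Maximizes sum(pi * affinity) + eps * entropy(pi) subject to the row
    marginals p and column marginals q, by alternately rescaling the rows
    and columns of the kernel exp(affinity / eps).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    K = np.exp(np.asarray(affinity, dtype=float) / eps)

    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)  # rescale to match the column marginals
        u = p / (K @ v)    # rescale to match the row marginals
    return u[:, None] * K * v[None, :]


# Toy example: affinity favors matching type 0 with 0 and type 1 with 1.
pi = sinkhorn_coupling([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5], [0.5, 0.5])
```

The inverse problem runs the other way: given an observed `pi`, recover an affinity matrix consistent with it.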
This paper investigates an ordered partial matching alignment problem, in which the goal is to align two sequences in the presence of potentially non-matching regions. We propose a novel parameter-free dynamic programming alignment method called hidden state time warping that allows an alignment path to switch between two different planes: a “visible” plane corresponding to matching sections and a “hidden” plane corresponding to non-matching sections. By defining two distinct planes, we can allow different types of time warping in each plane (e.g., imposing a maximum warping factor in matching regions while allowing completely unconstrained movements in non-matching regions). The resulting algorithm can determine the optimal continuous alignment path via dynamic programming, and the visible plane induces a (possibly) discontinuous alignment path containing matching regions. We show that this approach outperforms existing parameter-free methods on two different partial matching alignment problems involving speech and music.
