Title: Statistical inference with semiparametric nonignorable nonresponse models
Abstract: Dealing with nonignorable nonresponse is a challenging problem in statistical analysis with missing data. A parametric model assumption for the response mechanism is sensitive to model misspecification. We consider a semiparametric response model that relaxes the parametric model assumption in the response mechanism. Two types of efficient estimators, the profile maximum likelihood estimator and the profile calibration estimator, are proposed, and their asymptotic properties are investigated. Two extensive simulation studies compare the proposed estimators with some existing methods. We present an application of our method using data from the Korean Labor and Income Panel Survey.
Award ID(s):
1931380
PAR ID:
10420822
Author(s) / Creator(s):
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Scandinavian Journal of Statistics
Volume:
50
Issue:
4
ISSN:
0303-6898
Format(s):
Medium: X
Size(s):
p. 1795-1817
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary: Panel count data arise when the number of recurrent events experienced by each subject is observed intermittently at discrete examination times. The examination time process can be informative about the underlying recurrent event process even after conditioning on covariates. We consider a semiparametric accelerated mean model for the recurrent event process and allow the two processes to be correlated through a shared frailty. The regression parameters have a simple marginal interpretation of modifying the time scale of the cumulative mean function of the event process. A novel estimation procedure for the regression parameters and the baseline rate function is proposed based on a conditioning technique. In contrast to existing methods, the proposed method is robust in the sense that it requires neither the strong Poisson-type assumption for the underlying recurrent event process nor a parametric assumption on the distribution of the unobserved frailty. Moreover, the distribution of the examination time process is left unspecified, allowing for arbitrary dependence between the two processes. Asymptotic consistency of the estimator is established, and the variance of the estimator is estimated by a model-based smoothed bootstrap procedure. Numerical studies demonstrated that the proposed point estimator and variance estimator perform well with practical sample sizes. The methods are applied to data from a skin cancer chemoprevention trial.
  2. We propose a new estimator for average causal effects of a binary treatment with panel data in settings with general treatment patterns. Our approach augments the popular two‐way‐fixed‐effects specification with unit‐specific weights that arise from a model for the assignment mechanism. We show how to construct these weights in various settings, including the staggered adoption setting, where units opt into the treatment sequentially but permanently. The resulting estimator converges to an average (over units and time) treatment effect under the correct specification of the assignment model, even if the fixed‐effect model is misspecified. We show that our estimator is more robust than the conventional two‐way estimator: it remains consistent if either the assignment mechanism or the two‐way regression model is correctly specified. In addition, the proposed estimator performs better than the two‐way‐fixed‐effect estimator if the outcome model and assignment mechanism are locally misspecified. This strong robustness property underlines and quantifies the benefits of modeling the assignment process and motivates using our estimator in practice. We also discuss an extension of our estimator to handle dynamic treatment effects.
  3. Summary: It is important to draw causal inference from observational studies, but this becomes challenging if the confounders have missing values. Generally, causal effects are not identifiable if the confounders are missing not at random. In this article we propose a novel framework for nonparametric identification of causal effects with confounders subject to outcome-independent missingness, which means that the missing data mechanism is independent of the outcome, given the treatment and possibly missing confounders. We then propose a nonparametric two-stage least squares estimator and a parametric estimator for causal effects.
  4. Abstract: Here we present a machine learning–based wind reconstruction model. The model reconstructs hurricane surface winds with XGBoost, which is a decision-tree-based ensemble predictive algorithm. The model treats the symmetric and asymmetric wind fields separately. The symmetric wind field is approximated by a parametric wind profile model and two Bessel function series. The asymmetric field, accounting for asymmetries induced by the storm and its ambient environment, is represented using a small number of Laplacian eigenfunctions. The coefficients associated with Bessel functions and eigenfunctions are predicted by XGBoost based on storm and environmental features taken from NHC best-track and ERA-Interim data, respectively. We use HWIND for the observed wind fields. Three parametric wind profile models are tested in the symmetric wind model. The wind reconstruction model’s performance is insensitive to the choice of the profile model because the Bessel function series correct biases of the parametric profiles. The mean square error of the reconstructed surface winds is smaller than the climatological variance, indicating skillful reconstruction. Storm center location, eyewall size, and translation speed play important roles in controlling the magnitude of the leading asymmetries, while the phase of the asymmetries is mainly affected by storm translation direction. Vertical wind shear impacts the asymmetry phase to a lesser degree. Intended applications of this model include assessing hurricane risk using synthetic storm event sets generated by statistical–dynamical downscaling hurricane models.
  5. Abstract: The analysis of time series data with detection limits is challenging due to the high‐dimensional integral involved in the likelihood. Existing methods are either computationally demanding or rely on restrictive parametric distributional assumptions. We propose a semiparametric approach, where the temporal dependence is captured by a parametric copula, while the marginal distribution is estimated non‐parametrically. Utilizing the properties of copulas, we develop a new copula‐based sequential sampling algorithm, which provides a convenient way to calculate the censored likelihood. Even without full parametric distributional assumptions, the proposed method still allows us to efficiently compute the conditional quantiles of the censored response at a future time point, and thus construct both point and interval predictions. We establish the asymptotic properties of the proposed pseudo maximum likelihood estimator, and demonstrate through simulation and the analysis of water quality data that the proposed method is more flexible and leads to more accurate predictions than Gaussian‐based methods for non‐normal data. The Canadian Journal of Statistics 47: 438–454; 2019. © 2019 Statistical Society of Canada