

Title: Inference for Linear Models with Dependent Errors
Summary

The paper is concerned with inference for linear models with fixed regressors and weakly dependent stationary time series errors. Theoretically, we obtain asymptotic normality for the M-estimator of the regression parameter under mild conditions and establish a uniform Bahadur representation for recursive M-estimators. Methodologically, we extend the recently proposed self-normalized approach of Shao from stationary time series to the regression set-up, where the sequence of response variables is typically non-stationary in mean. Since the limiting distribution of the self-normalized statistic depends on the design matrix and its corresponding critical values are case dependent, we develop a simulation-based approach to approximate the critical values consistently. Through a simulation study, we demonstrate favourable finite sample performance of our method in comparison with a block-bootstrap-based approach. Empirical illustrations using two real data sets are also provided.
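As a generic illustration of how such design-dependent critical values can be approximated by simulation, one can draw many Monte Carlo replicates of the pivotal statistic from its reference distribution and take empirical quantiles. The sketch below is only a hedged illustration of that recipe; `simulate_statistic` is a hypothetical placeholder, not the paper's specific limiting functional.

```python
import numpy as np

def mc_critical_value(simulate_statistic, alpha=0.05, n_rep=10_000, seed=0):
    """Approximate the (1 - alpha) critical value of a pivotal statistic by
    Monte Carlo: draw it n_rep times and take the empirical quantile.

    simulate_statistic(rng) is a user-supplied callable returning one draw from
    the statistic's (design-dependent) reference distribution; it stands in for
    whatever limiting functional applies and is not the paper's construction.
    """
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_statistic(rng) for _ in range(n_rep)])
    return float(np.quantile(draws, 1.0 - alpha))
```

The empirical quantile converges to the true critical value as the number of replicates grows, which is the sense in which a simulation-based approximation can be consistent.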

 
NSF-PAR ID: 10401220
Author(s) / Creator(s): ;
Publisher / Repository: Oxford University Press
Date Published:
Journal Name: Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume: 75
Issue: 2
ISSN: 1369-7412
Page Range / eLocation ID: p. 323-343
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Summary

    We propose a new method to construct confidence intervals for quantities that are associated with a stationary time series, which avoids direct estimation of the asymptotic variances. Unlike the existing tuning-parameter-dependent approaches, our method has the attractive convenience of being free of any user-chosen number or smoothing parameter. The interval is constructed on the basis of an asymptotically distribution-free self-normalized statistic, in which the normalizing matrix is computed by using recursive estimates. Under mild conditions, we establish the theoretical validity of our method for a broad class of statistics that are functionals of the empirical distribution of fixed or growing dimension. From a practical point of view, our method is conceptually simple, easy to implement and can be readily used by the practitioner. Monte Carlo simulations are conducted to compare the finite sample performance of the new method with those delivered by the normal approximation and the block bootstrap approach.
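    For the simplest case, the mean of a stationary series, the construction can be sketched as follows. This is a hedged illustration of the standard self-normalization recipe (a normalizer built from recursive sample means), not the paper's general functional setting; the critical value `crit` of the pivotal limit is passed in rather than hard-coded.

    ```python
    import numpy as np

    def self_normalized_ci_mean(x, crit):
        """Self-normalized confidence interval for the mean of a stationary series.

        The normalizer is built from recursive sample means, so no bandwidth or
        block length needs to be chosen.  crit is the upper critical value of the
        pivotal limiting distribution (tabulated in the self-normalization
        literature or approximated by simulation).
        """
        x = np.asarray(x, dtype=float)
        n = len(x)
        t = np.arange(1, n + 1)
        recursive_means = np.cumsum(x) / t           # \bar{X}_t for t = 1..n
        xbar = recursive_means[-1]
        d_n = np.sum(t**2 * (recursive_means - xbar) ** 2) / n**2   # self-normalizer
        half_width = np.sqrt(crit * d_n / n)         # from n*(xbar - mu)^2 / d_n <= crit
        return xbar - half_width, xbar + half_width
    ```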

     
  2. Background

    Metamodels can address some of the limitations of complex simulation models by formulating a mathematical relationship between input parameters and simulation model outcomes. Our objective was to develop and compare the performance of a machine learning (ML)–based metamodel against a conventional metamodeling approach in replicating the findings of a complex simulation model.

    Methods

    We constructed 3 ML-based metamodels using random forest, support vector regression, and artificial neural networks, and a linear regression-based metamodel, from a previously validated microsimulation model of the natural history of hepatitis C virus (HCV) consisting of 40 input parameters. Outcomes of interest included societal costs and quality-adjusted life-years (QALYs), the incremental cost-effectiveness ratio (ICER) of HCV treatment versus no treatment, the cost-effectiveness acceptability curve (CEAC), and the expected value of perfect information (EVPI). We evaluated metamodel performance using root mean squared error (RMSE) and Pearson's R² on the normalized data.
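    As a hedged sketch of this workflow (not the authors' code), a random forest metamodel can be fit to the simulation's sampled inputs and outcomes and scored with RMSE and R²; the synthetic arrays below merely stand in for the microsimulation's runs.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    # Stand-ins for the microsimulation runs: 40 sampled input parameters per run
    # and one outcome (e.g. QALYs); real simulation output would replace these.
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(2000, 40))
    y = X @ rng.normal(size=40) + rng.normal(scale=0.1, size=2000)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    metamodel = RandomForestRegressor(n_estimators=500, random_state=0)
    metamodel.fit(X_train, y_train)

    pred = metamodel.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, pred))   # root mean squared error
    r2 = r2_score(y_test, pred)                        # coefficient of determination on held-out runs
    print(f"RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
    ```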

    Results

    The R² values for the linear regression metamodel for QALYs without treatment, QALYs with treatment, societal cost without treatment, societal cost with treatment, and ICER were 0.92, 0.98, 0.85, 0.92, and 0.60, respectively. The corresponding R² values for our ML-based metamodels were 0.96, 0.97, 0.90, 0.95, and 0.49 for support vector regression; 0.99, 0.83, 0.99, 0.99, and 0.82 for artificial neural network; and 0.99, 0.99, 0.99, 0.99, and 0.98 for random forest. Similar trends were observed for RMSE. The CEAC and EVPI curves produced by the random forest metamodel matched the results of the simulation output more closely than those of the linear regression metamodel.

    Conclusions

    ML-based metamodels generally outperformed traditional linear regression metamodels at replicating results from complex simulation models, with random forest metamodels performing best.

    Highlights

    Decision-analytic models are frequently used by policy makers and other stakeholders to assess the impact of new medical technologies and interventions. However, complex models can impose limitations on conducting probabilistic sensitivity analysis and value-of-information analysis, and may not be suitable for developing online decision-support tools. Metamodels, which accurately formulate a mathematical relationship between input parameters and model outcomes, can replicate complex simulation models and address the above limitation. The machine learning–based random forest model can outperform linear regression in replicating the findings of a complex simulation model. Such a metamodel can be used for conducting cost-effectiveness and value-of-information analyses or developing online decision support tools.

     
  3. One of the top priorities in observational astronomy is the direct imaging and characterization of extrasolar planets (exoplanets) and planetary systems. Direct images of rocky exoplanets are of particular interest in the search for life beyond the Earth, but they tend to be rather challenging targets since they are orders of magnitude dimmer than their host stars and are separated by small angular distances that are comparable to the classical λ/D diffraction limit, even for the coming generation of 30 m class telescopes. Current and planned efforts for ground-based direct imaging of exoplanets combine high-order adaptive optics (AO) with a stellar coronagraph observing at wavelengths ranging from the visible to the mid-IR. The primary barrier to achieving high contrast with current direct imaging methods is quasi-static speckles, caused largely by non-common path aberrations (NCPAs) in the coronagraph optical train. Recent work has demonstrated that millisecond imaging, which effectively “freezes” the atmosphere’s turbulent phase screens, should allow the wavefront sensor (WFS) telemetry to be used as a probe of the optical system to measure NCPAs. Starting with a realistic model of a telescope with an AO system and a stellar coronagraph, this paper provides simulations of several closely related regression models that take advantage of millisecond telemetry from the WFS and coronagraph’s science camera. The simplest regression model, called the naïve estimator, does not treat the noise and other sources of information loss in the WFS. Despite its flaws, in one of the simulations presented herein, the naïve estimator provides a useful estimate of an NCPA of ∼0.5 radian RMS (≈λ/13), with an accuracy of ∼0.06 radian RMS in 1 min of simulated sky time on a magnitude 8 star. The bias-corrected estimator generalizes the regression model to account for the noise and information loss in the WFS. A simulation of the bias-corrected estimator with 4 min of sky time included an NCPA of ∼0.05 radian RMS (≈λ/130) and an extended exoplanet scene. The joint regression of the bias-corrected estimator simultaneously achieved an NCPA estimate with an accuracy of ∼5×10⁻³ radian RMS and an estimate of the exoplanet scene that was free of the self-subtraction artifacts typically associated with differential imaging. The 5σ contrast achieved by imaging of the exoplanet scene was ∼1.7×10⁻⁴ at a distance of 3λ/D from the star and ∼2.1×10⁻⁵ at 10λ/D. These contrast values are comparable to the very best on-sky results obtained from multi-wavelength observations that employ both angular differential imaging (ADI) and spectral differential imaging (SDI). This comparable performance is achieved even though our simulations are quasi-monochromatic, which rules out SDI, and have no diurnal field rotation, which rules out ADI. The error covariance matrix of the joint regression shows substantial correlations in the exoplanet and NCPA estimation errors, indicating that exoplanet intensity and NCPA need to be estimated self-consistently to achieve high contrast.
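    Schematically, the core of the naïve estimator is an ordinary least-squares regression of the millisecond science-camera intensities on regressors derived from the simultaneous WFS telemetry. The sketch below shows only that generic least-squares step with hypothetical inputs; the paper's actual regression models are built on a detailed optical model and, for the bias-corrected estimator, account for WFS noise and information loss.

    ```python
    import numpy as np

    def naive_ncpa_regression(design, intensities):
        """Generic least-squares step: regress measured focal-plane intensities
        on regressors built from simultaneous WFS telemetry.

        design      : (n_frames, n_modes) regressor matrix derived from the WFS
                      residual phases through an optical model (hypothetical input).
        intensities : (n_frames,) millisecond science-camera measurements.
        Returns the least-squares coefficient estimate; no correction for WFS
        noise or information loss is applied, which is what "naive" refers to.
        """
        coef, *_ = np.linalg.lstsq(design, intensities, rcond=None)
        return coef
    ```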

     
  4. Abstract An electrodynamic levitation thermal-gradient diffusion chamber was used to grow 268 individual, small ice particles (initial radii of 8–26 μm) from the vapor, at temperatures ranging from −65° to −40°C, and supersaturations up to liquid saturation. Growth limited by attachment kinetics was frequently measured at low supersaturation, as shown in prior work. At high supersaturation, enhanced growth was measured, likely due to the development of branches and hollowed facets. The effects of branching and hollowing on particle growth are often treated with an effective density ρ_eff. We fit the measured time series with two different models to estimate size-dependent ρ_eff values: the first model decreases ρ_eff to an asymptotic deposition density ρ_dep, and the second models ρ_eff by a power law with exponent P. Both methods produce similar results, though the fits with ρ_dep typically have lower relative errors. The fit results do not correspond well with models of isometric or planar single-crystalline growth. While single-crystalline columnar crystals correspond to some of the highest growth rates, a newly constructed geometric model of budding rosette crystals produces the best match with the growth data. The relative frequencies of occurrence of the ρ_dep and P values show a clear dependence on ice supersaturation normalized to liquid saturation. We use these relative frequencies of ρ_dep and P to derive two supersaturation-dependent mass–size relationships suitable for cloud modeling applications.
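    As a hedged illustration of the power-law option, an effective-density curve of the form ρ_eff = ρ0 (r/r_ref)^P can be fit to size-resolved estimates by nonlinear least squares; the reference radius, synthetic data, and exact parameterization below are assumptions for illustration rather than the paper's fitted model.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def power_law(radius, rho0, p):
        """Power-law effective density: rho_eff = rho0 * (radius / r_ref) ** p.
        The reference radius r_ref and this exact parameterization are
        illustrative assumptions, not the paper's fitted form."""
        r_ref = 10e-6  # 10 micrometres, an arbitrary reference scale
        return rho0 * (radius / r_ref) ** p

    # radius (m) and rho_eff (kg m^-3) would come from the measured growth series;
    # the synthetic values below are placeholders so the sketch runs end to end.
    radius = np.linspace(8e-6, 26e-6, 30)
    noise = 1 + 0.05 * np.random.default_rng(0).normal(size=30)
    rho_eff = 900.0 * (radius / 10e-6) ** (-0.6) * noise

    (rho0_hat, p_hat), _ = curve_fit(power_law, radius, rho_eff, p0=(900.0, -0.5))
    print(f"fitted rho0 = {rho0_hat:.0f} kg m^-3, exponent P = {p_hat:.2f}")
    ```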
  5. Aims. We study the relative helicity of active region (AR) NOAA 12673 during a ten-hour time interval centered around a preceding X2.2 flare (SOL2017-09-06T08:57) and also including an eruptive X9.3 flare that occurred three hours later (SOL2017-09-06T11:53). In particular, we aim for a reliable estimate of the normalized self-helicity of the current-carrying magnetic field, the so-called helicity ratio, |H_J|/|H_V|, a promising candidate for quantifying the eruptive potential of solar ARs. Methods. Using Solar Dynamics Observatory Helioseismic and Magnetic Imager vector magnetic field data as input, we employ nonlinear force-free (NLFF) coronal magnetic field models computed with an optimization approach. The corresponding relative helicity, and related quantities, are computed using a finite-volume method. From multiple time series of NLFF models based on different choices of free model parameters, we are able to assess the spread of |H_J|/|H_V| and to estimate its uncertainty. Results. In comparison to earlier works, which identified the non-solenoidal contribution to the total magnetic energy, E_div/E, as the selection criterion regarding the required solenoidal quality of magnetic field models for subsequent relative helicity analysis, we propose to use in addition the non-solenoidal contribution to the free magnetic energy, |E_mix|/E_{J,s}. As a recipe for a reliable estimate of the relative magnetic helicity (and related quantities), we recommend employing multiple NLFF models based on different combinations of free model parameters, retaining only those that exhibit the smallest values of both E_div/E and |E_mix|/E_{J,s} at a given time instant, computing mean estimates from the retained models, and using the spread of the individually contributing values as an indication of the uncertainty.
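    In pseudocode terms, the recommended recipe reduces to a filter-then-average step over many NLFF reconstructions at one time instant. The sketch below covers only that bookkeeping step with hypothetical field names; computing the diagnostics E_div/E, |E_mix|/E_{J,s}, and |H_J|/|H_V| themselves requires the full finite-volume helicity machinery and is not shown.

    ```python
    import numpy as np

    def helicity_ratio_from_models(models, n_keep=5):
        """Keep the NLFF reconstructions with the best solenoidal quality and
        report the mean and spread of the helicity ratio |H_J|/|H_V|.

        models is a list of dicts with keys 'ediv_over_e', 'emix_over_ejs', and
        'hj_over_hv' (placeholder names, not an existing API).  Ranking by the
        pair of diagnostics is a simplification of "smallest values of both".
        """
        ranked = sorted(models, key=lambda m: (m["ediv_over_e"], m["emix_over_ejs"]))
        kept = [m["hj_over_hv"] for m in ranked[:n_keep]]
        return float(np.mean(kept)), float(np.std(kept))   # estimate and spread
    ```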