Title: Confidence regions for the location of response surface optima: the R package OptimaRegion
Statistical inference on the location of the optima (global maxima or minima) is one of the main goals in the area of Response Surface Methodology, with many applications in engineering and science. While there exist previous methods for computing confidence regions on the location of optima, these are for linear models based on a Normal distribution assumption, and do not address specifically the difficulties associated with guaranteeing global optimality. This paper describes distribution-free methods for the computation of confidence regions on the location of the global optima of response surface models. The methods are based on bootstrapping and Tukey's data depth, and therefore their performance does not rely on distributional assumptions about the errors affecting the response. An R language implementation, the package \code{OptimaRegion}, is described. Both parametric (quadratic and cubic polynomials in up to 5 covariates) and nonparametric models (thin plate splines in 2 covariates) are supported. A coverage analysis is presented demonstrating the quality of the regions found. The package also contains an R implementation of the Gloptipoly algorithm for the global optimization of polynomial responses subject to bounds.  more » « less
Award ID(s):
1634878
PAR ID:
10029358
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Communications in Statistics - Simulation and Computation
ISSN:
0361-0918
Page Range / eLocation ID:
1 to 21
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Mireji, Paul O (Ed.)
    Mosquito vectors of pathogens (e.g., Aedes, Anopheles, and Culex spp., which transmit dengue, Zika, chikungunya, West Nile, malaria, and others) are of increasing concern for global public health. These vectors are geographically shifting under climate and other anthropogenic changes. As small-bodied ectotherms, mosquitoes are strongly affected by temperature, which causes unimodal responses in mosquito life history traits (e.g., biting rate, adult mortality rate, mosquito development rate, and probability of egg-to-adult survival) that exhibit upper and lower thermal limits and intermediate thermal optima in laboratory studies. However, it remains unknown how mosquito thermal responses measured in laboratory experiments relate to the realized thermal responses of mosquitoes in the field. To address this gap, we leverage thousands of global mosquito occurrences and geospatial satellite data at high spatial resolution to construct machine-learning-based species distribution models, from which vector thermal responses are estimated. We apply methods to restrict models to the relevant mosquito activity season and to conduct ecologically plausible spatial background sampling centered around ecoregions for comparison to mosquito occurrence records. We found that thermal minima estimated from laboratory studies were highly correlated with those from the species distributions (r = 0.87). The thermal optima were less strongly correlated (r = 0.69). For most species, we did not detect thermal maxima from their observed distributions, so we were unable to compare them to laboratory-based estimates. The results suggest that laboratory studies have the potential to be highly transportable for predicting lower thermal limits and thermal optima of mosquitoes in the field. At the same time, lab-based models likely capture physiological limits on mosquito persistence at high temperatures that are not apparent from field-based observational studies but may critically determine mosquito responses to climate warming. Our results indicate that lab-based and field-based studies are highly complementary; performing the analyses in concert can help to more comprehensively understand vector responses to climate change.
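The lab-versus-field comparison can be pictured with a toy calculation (not the authors' species-distribution-model pipeline): fit a unimodal curve to a mock suitability-versus-temperature relationship, read off the thermal optimum and lower limit, and correlate such field-derived values with lab-derived values across species. All numbers below are made up for illustration.

```r
# Toy comparison of field- and lab-derived thermal responses (illustration only).
set.seed(42)
temp <- seq(10, 40, by = 0.5)
suit <- 1 - (temp - 27)^2 / 200 + rnorm(length(temp), sd = 0.03)  # mock habitat suitability

fit   <- lm(suit ~ poly(temp, 2, raw = TRUE))  # simple unimodal (quadratic) thermal response
b     <- coef(fit)
t_opt <- -b[2] / (2 * b[3])                    # fitted thermal optimum (vertex of the parabola)
t_min <- min(Re(polyroot(b)))                  # fitted lower thermal limit (lower root)

# Hypothetical per-species optima, analogous to the reported correlation for optima
field_opt <- c(26.8, 25.1, 28.3)
lab_opt   <- c(27.0, 25.4, 28.9)
cor(field_opt, lab_opt)
```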
  2. Abstract: In land surface models (LSMs), the hydraulic properties of the subsurface are commonly estimated according to the texture of soils at the Earth's surface. This approach ignores macropores, fracture flow, heterogeneity, and the effects of variable distribution of water in the subsurface on effective watershed-scale hydraulic variables. Using hydrograph recession analysis, we empirically constrain estimates of watershed-scale effective hydraulic conductivities (K) and effective drainable aquifer storages (S) of all reference watersheds in the conterminous United States for which sufficient streamflow data are available (n = 1,561). Then, we use machine learning methods to model these properties across the entire conterminous United States. Model validation results in high confidence for estimates of log(K) (r2 > 0.89; 1% < bias < 9%) and reasonable confidence for S (r2 > 0.83; −70% < bias < −18%). Our estimates of effective K are, on average, two orders of magnitude higher than comparable soil-texture-based estimates of average K, confirming the importance of soil structure and preferential flow pathways at the watershed scale. Our estimates of effective S compare favorably with recent global estimates of mobile groundwater and are spatially heterogeneous (5–3,355 mm). Because estimates of S are much lower than the global maximums generally used in LSMs (e.g., 5,000 mm in Noah-MP), they may serve both to limit model spin-up time and to constrain model parameters to more realistic values. These results represent the first attempt to constrain estimates of watershed-scale effective hydraulic variables that are necessary for the implementation of LSMs for the entire conterminous United States.
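The recession-analysis step can be sketched in a few lines using the common Brutsaert-Nieber form -dQ/dt = a·Q^b. This is a generic illustration of how recession parameters are estimated from a receding hydrograph limb, not the paper's exact procedure.

```r
# Schematic recession analysis on a mock recession limb (illustration only).
set.seed(7)
q  <- 50 * exp(-0.05 * (1:120)) * exp(rnorm(120, sd = 0.02))  # mock daily streamflow, receding
dq <- -diff(q)                           # recession rate, -dQ/dt (daily time step)
qm <- (q[-1] + q[-length(q)]) / 2        # midpoint discharge for each step

rec <- dq > 0                            # keep strictly receding steps
fit <- lm(log(dq[rec]) ~ log(qm[rec]))   # log(-dQ/dt) = log(a) + b * log(Q)
coef(fit)                                # intercept ~ log(a); slope ~ b
```

In the paper, recession parameters of this kind are translated into effective K and S and then extrapolated across the conterminous United States with machine learning; only the fitting step is sketched here.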
  3. We consider inference for the parameters of a linear model when the covariates are random and the relationship between response and covariates is possibly non-linear. Conventional inference methods such as z intervals perform poorly in these cases. We propose a double bootstrap-based calibrated percentile method, perc-cal, as a general-purpose CI method which performs very well relative to alternative methods in challenging situations such as these. The superior performance of perc-cal is demonstrated by a thorough, full-factorial design synthetic data study as well as a data example involving the length of criminal sentences. We also provide theoretical justification for the perc-cal method under mild conditions. The method is implemented in the R package "perccal", available through CRAN and coded primarily in C++, to make it easier for practitioners to use. 
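The double-bootstrap calibration idea behind perc-cal can be sketched in base R for a regression slope. This illustrates the method, not the perccal package's interface, which should be consulted for actual use.

```r
# Double-bootstrap calibrated percentile interval for a regression slope (illustration only).
set.seed(3)
n   <- 80
x   <- rnorm(n)
y   <- 1 + 2 * x + 0.3 * x^2 + rt(n, df = 3)     # mildly nonlinear truth, heavy-tailed errors
dat <- data.frame(x = x, y = y)
slope <- function(d) coef(lm(y ~ x, data = d))[2]
theta <- slope(dat)

B1 <- 200; B2 <- 50; alpha <- 0.05
coverage <- function(lambda) {           # estimated coverage of a percentile CI at level lambda
  mean(replicate(B1, {
    d1 <- dat[sample(n, replace = TRUE), ]
    t2 <- replicate(B2, slope(d1[sample(n, replace = TRUE), ]))
    ci <- quantile(t2, c(lambda / 2, 1 - lambda / 2))
    ci[1] <= theta && theta <= ci[2]
  }))
}
cand   <- c(0.01, 0.025, 0.05, 0.10)                                  # candidate nominal levels
lambda <- cand[which.min(abs(sapply(cand, coverage) - (1 - alpha)))]  # calibrated level
t1 <- replicate(B1, slope(dat[sample(n, replace = TRUE), ]))
quantile(t1, c(lambda / 2, 1 - lambda / 2))                           # calibrated percentile CI
```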
  4. Abstract: Adaptive multiple testing with covariates is an important research direction that has gained major attention in recent years. It has been widely recognised that leveraging side information provided by auxiliary covariates can improve the power of false discovery rate (FDR) procedures. Currently, most such procedures are devised with p-values as their main statistics. However, for two-sided hypotheses, the usual data processing step that transforms the primary statistics, known as z-values, into p-values not only leads to a loss of information carried by the main statistics, but can also undermine the ability of the covariates to assist with the FDR inference. We develop a z-value based covariate-adaptive (ZAP) methodology that operates on the intact structural information encoded jointly by the z-values and covariates. It seeks to emulate the oracle z-value procedure via a working model, and its rejection regions significantly depart from those of the p-value adaptive testing approaches. The key strength of ZAP is that the FDR control is guaranteed with minimal assumptions, even when the working model is misspecified. We demonstrate the state-of-the-art performance of ZAP using both simulated and real data, which shows that the efficiency gain can be substantial in comparison with p-value-based methods. Our methodology is implemented in the R package zap.
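The information-loss point can be seen in a toy setup: when a covariate predicts the sign of the effect, collapsing z-values to two-sided p-values discards that sign. The sketch below builds such data and runs the p-value baseline (Benjamini-Hochberg); the zap package itself operates on the z-values and covariates directly, and its function names are deliberately not shown here to avoid guessing its interface.

```r
# Toy two-sided testing setup where a covariate predicts the sign of the effect.
set.seed(11)
m  <- 2000
x  <- runif(m)                                               # auxiliary covariate
mu <- ifelse(runif(m) < 0.1, ifelse(x > 0.5, 2.5, -2.5), 0)  # covariate is informative about sign
z  <- rnorm(m, mean = mu)                                    # primary z-value statistics

p      <- 2 * pnorm(-abs(z))                  # two-sided p-values: the sign of z is discarded
rej_bh <- p.adjust(p, method = "BH") <= 0.05  # p-value baseline (Benjamini-Hochberg)
sum(rej_bh)                                   # rejections without using z's sign or the covariate
```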
  5. Assessing several individuals intensively over time yields intensive longitudinal data (ILD). Even though ILD provide rich information, they also bring data analytic challenges. One of these is the increased occurrence of missingness with increased study length, possibly under non-ignorable missingness scenarios. Multiple imputation (MI) handles missing data by creating several imputed data sets and pooling the estimation results across them to yield final estimates for inferential purposes. In this article, we introduce dynr.mi(), a function in the R package Dynamic Modeling in R (dynr). The dynr package provides a suite of fast and accessible functions for estimating and visualizing the results from fitting linear and nonlinear dynamic systems models in discrete as well as continuous time. By integrating the estimation functions in dynr with the MI procedures available from the R package Multivariate Imputation by Chained Equations (MICE), the dynr.mi() routine is designed to handle possibly non-ignorable missingness in the dependent variables and/or covariates of a user-specified dynamic systems model via MI, with convergence diagnostic checks. We used dynr.mi() to examine, in the context of a vector autoregressive model, the relationships among individuals' ambulatory physiological measures and self-reported affect valence and arousal. The results from MI were compared to those from listwise deletion of entries with missingness in the covariates. When the number of iterations was determined based on the convergence diagnostics available from dynr.mi(), differences in the statistical significance of the covariate parameters were observed between the listwise deletion and MI approaches. These results underscore the importance of considering diagnostic information when implementing MI procedures.
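The impute-then-pool workflow that dynr.mi() automates can be sketched with mice and a toy lag-1 regression standing in for the dynamic systems model. This is a generic illustration of the MI idea, not the dynr.mi() interface, and the data below are simulated stand-ins for the ambulatory measures described in the abstract.

```r
# Generic impute-then-pool sketch of the workflow dynr.mi() automates (illustration only).
library(mice)
set.seed(5)
n <- 200
affect <- as.numeric(arima.sim(list(ar = 0.5), n))   # mock self-reported affect series
physio <- 0.4 * affect + rnorm(n)                    # mock ambulatory physiological series
physio[sample(n, 40)] <- NA                          # inject missingness in a covariate
d <- data.frame(physio     = physio,
                affect     = affect,
                affect_lag = c(NA, affect[-n]))      # lag-1 term for a toy VAR-style model

imp  <- mice(d, m = 5, printFlag = FALSE)            # multiple imputation (default methods)
fits <- with(imp, lm(affect ~ affect_lag + physio))  # refit the model in each imputed data set
summary(pool(fits))                                  # pooled estimates via Rubin's rules
```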