We propose new tests for assessing whether covariates in a treatment group and a matched control group are balanced in observational studies. The tests exhibit high power under a wide range of multivariate alternatives, including some against which existing tests have little power. The asymptotic permutation null distributions of the proposed tests are studied and the
We evaluate the validity of a projection‐based test checking linear models when the number of covariates tends to infinity, and analyze two gene expression datasets. We show that the test is still consistent and derive the asymptotic distributions under the null and alternative hypotheses. The asymptotic properties are almost the same as those when the number of covariates is fixed as long as
- NSF-PAR ID:
- 10453598
- Publisher / Repository:
- Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name:
- Statistics in Medicine
- Volume:
- 40
- Issue:
- 13
- ISSN:
- 0277-6715
- Page Range / eLocation ID:
- p. 3153-3166
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract P‐values calculated through the asymptotic results work well in simulation studies, facilitating the application of the test to large data sets. The tests are illustrated in a study of the effect of smoking on blood lead levels. The proposed tests are implemented in an R package, BalanceCheck.
-
Abstract Aim To test the latitudinal gradient in plant species diversity for self‐similarity across taxonomic scales and amongst taxa.
Location North America.
Methods We used species richness data from 245 local vascular plant floras to quantify the slope and shape of the latitudinal gradients in species diversity (LGSD) across all plant species as well as within each family and order. We calculated the contribution of each family and order to the empirical LGSD.
Results We observed the canonical LGSD when all plants were considered, with floras at the lowest latitudes having, on average, 451 more species than floras at the highest latitudes. When considering slope alone, most orders and families showed the expected negative slope, but 31.7% of families and 27.7% of orders showed either no significant relationship between latitude and diversity or a reverse LGSD. Latitudinal patterns of family diversity account for at least 14% of this LGSD. Most orders and families did not show the negative slope and concave‐down quadratic shape expected by the pattern for all plant species. A majority of families did not make a significant contribution in species to the LGSD, with 53% of plant families contributing little to nothing to the overall gradient. Ten families accounted for more than 70% of the gradient. Two families, the Asteraceae and Fabaceae, contributed a third of the LGSD.
Main Conclusions The empirical LGSD we describe here is a consequence of a gradient in the number of families and diversification within relatively few plant families. Macroecological studies typically aim to generate models that are general across taxa with the implicit assumption that the models are general within taxa. Our results strongly suggest that models of the latitudinal gradient in plant species richness that rely on environmental covariates (e.g. temperature, energy) are likely not general across plant taxa.
-
Abstract The time compression (or time condensation) approximation (TCA) is commonly used in conjunction with an infiltration capacity equation for predicting the postponding infiltration rate, or, more generally, infiltration under time‐varying precipitation. In this paper a power function relationship for TCA between infiltration capacity and its time derivative is proposed for infiltration in the presence of a shallow water table. The results show that the exponent (β) in the power function relationship is not a constant but decreases as infiltration proceeds. The change of β indicates that the TCA relationship changes during infiltration and further suggests the necessity of using different TCA relationships for predicting infiltration rate during different stages after ponding. We argue that the change of β is due to the gradual dynamic change of the relative role of gravity and capillarity during infiltration. A Péclet number (Pe) is proposed for measuring the relative effect of gravity and capillarity. In the early times of infiltration when Pe < 1, with the increase of Pe, β decreases roughly from 3.5 to 2 for clay, silty clay loam, and silty loam, and from 3 to 2 for sandy loam and sand; during the longer times when Pe > 1, β has a linear relationship with Pe. The relationship between Pe and β provides an objective approach to select the suitable TCA function during different infiltration stages after ponding.
-
Abstract Aim This paper assesses the relative importance of environmental filtering and dispersal limitations as controls on the western range limit of Fagus grandifolia, a common mesic late‐successional tree species in the eastern United States. We also test for differences in species–environment relationships between range‐edge populations of F. grandifolia in eastern Wisconsin and core populations in Michigan. Because environmental conditions between the states differ moderately, while in Michigan dispersal presumably no longer limits F. grandifolia distributions, F. grandifolia offers a classic case study for biogeographers, foresters, and palaeoecologists interested in understanding processes governing species range limits.
Location Wisconsin and Michigan, USA.
Taxon Fagus grandifolia.
Methods This study combines historical datasets of F. grandifolia from the Public Land Survey, environmental covariates from soil maps and historical climate data, three spatial scenarios of dispersal limitation, and five species distribution models (SDMs). We test dispersal limitation and environmental filtering hypotheses by assessing SDM transferability between core and edge populations, measuring the importance of dispersal and environmental predictors, and using a residual autocovariate model to test for spatial processes not represented by these predictors.
Results Fagus grandifolia presence was best predicted by total snowfall in Michigan and by dispersal, summer precipitation, and potential evapotranspiration (PET) in Wisconsin. Following the addition of dispersal as a predictor, most Wisconsin models improved and spatial autocorrelation effects largely disappeared. Transferability between core and edge populations was moderate to low.
Main conclusions Both environmental and dispersal limitations appear to govern the western range limit of F. grandifolia. Species–environment relationships differ between range‐edge and core populations, suggesting either stronger environmental filtering at the range edge or fine‐scale, spatially varying interactions between environmental factors governing moisture availability in core populations. Although lakes, like Lake Michigan, both moderate regional climates and act as dispersal barriers, these effects can be disentangled through the joint analysis of SDMs and historic observational datasets.
-
Abstract We focus on the all‐pairs minimum cut (APMC) problem, a graph partitioning problem whose solution requires finding the minimum cut for every pair of nodes in a given graph. While it is solved for undirected graphs, a solution for APMC in directed graphs still requires an O(n²) brute force approach. We show that the empirical number of distinct minimum cuts in randomly generated strongly connected directed graphs is proportional to n rather than the theoretical value of n², suggesting the possibility of an algorithm which finds all minimum cuts in less than O(n²) time. We also provide an example of the strict upper bound on the number of cuts in graphs with three nodes. We model the distributions with the Generalized extreme value (GEV) distribution and enable the possibility of using a GEV distribution to predict the probability of achieving a certain number of minimum cuts, given the number of nodes and edges. Finally, we contribute to the notion of symmetric cuts by showing that there can be O(n²) symmetric cuts in graphs when node replication is allowed.
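The brute force approach mentioned in the APMC abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes integer edge capacities, uses the Edmonds-Karp max-flow algorithm as a stand-in for whatever solver the paper used, and the names `min_cut` and `all_pairs_min_cuts` are hypothetical. It records only one minimum cut per ordered pair (the one whose source side is smallest), so a pair with several equally small cuts contributes a single partition.

```python
from collections import deque
from itertools import permutations


def min_cut(n, edges, s, t):
    """Edmonds-Karp max flow from s to t on a directed graph.

    Returns (cut value, frozenset of nodes on the source side of one min cut).
    `edges` is a list of (u, v, capacity) triples with integer capacities.
    """
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            # No augmenting path: nodes reached from s form the source side.
            return flow, frozenset(v for v in range(n) if parent[v] != -1)
        # Find the bottleneck capacity along the path, then push flow along it.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def all_pairs_min_cuts(n, edges):
    """Run one max-flow computation per ordered pair: n * (n - 1) calls in total."""
    values = {}
    partitions = set()
    for s, t in permutations(range(n), 2):
        value, source_side = min_cut(n, edges, s, t)
        values[(s, t)] = value
        partitions.add(source_side)
    return values, partitions


# Example: a directed 3-cycle with unit capacities (strongly connected).
cycle = [(0, 1, 1), (1, 2, 1), (2, 0, 1)]
values, partitions = all_pairs_min_cuts(3, cycle)
print(len(values), "ordered pairs,", len(partitions), "distinct source sides")
```

On the 3-cycle example, six ordered pairs yield only three distinct source sides, a toy instance of the gap between the number of pairs and the number of distinct cuts that the abstract reports for random graphs.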