
Search for: All records

Creators/Authors contains: "Peng, Wei"


  1. Chen, Yi-Hau ; Stufken, John ; Judy_Wang, Huixia (Ed.)
    Though introduced nearly 50 years ago, the infinitesimal jackknife (IJ) remains a popular modern tool for quantifying predictive uncertainty in complex estimation settings. In particular, when supervised learning ensembles are constructed via bootstrap samples, recent work demonstrated that the IJ estimate of variance is particularly convenient and useful. However, despite the algebraic simplicity of its final form, its derivation is rather complex. As a result, studies clarifying the intuition behind the estimator or rigorously investigating its properties have been severely lacking. This work aims to take a step forward on both fronts. We demonstrate that, surprisingly, the exact form of the IJ estimator can be obtained via a straightforward linear regression of the individual bootstrap estimates on their respective weights or via the classical jackknife. The latter realization allows us to formally investigate the bias of the IJ variance estimator and better characterize the settings in which its use is appropriate. Finally, we extend these results to the case of U-statistics where base models are constructed via subsampling rather than bootstrapping and provide a consistent estimate of the resulting variance.
    Free, publicly-accessible full text available July 1, 2025
  2. Abstract

    Scaling up electric vehicles (EVs) provides an avenue to mitigate both carbon emissions and air pollution from road transport. The benefits of EV adoption for climate, air quality, and health have been widely documented. Yet, evidence on the distribution of these impacts has not been systematically reviewed, despite its central importance to ensure a just and equitable transition. Here, we perform a systematic review of recent EV studies that have examined the spatial distribution of the emissions, air pollution, and health impacts, as an important aspect of the equity implications. Using the Context-Interventions-Mechanisms-Outcome framework with a two-step search strategy, we narrowed the literature down to 47 papers that met our inclusion criteria for detailed review and synthesis. We identified two key factors that have been found to influence spatial distributions. First, cross-sectoral linkages may result in unintended impacts elsewhere. For instance, the generation of electricity to charge EVs, and the production of batteries and other materials to manufacture EVs, could increase the emissions and pollution in locations other than where EVs are adopted. Second, since air pollution and health are local issues, additional location-specific factors may play a role in determining the spatial distribution, such as the wind transport of pollution, and the size and vulnerability of the exposed populations. Based on our synthesis of existing evidence, we highlight two important areas for further research: (1) fine-scale pollution and health impact assessment to better characterize exposure and health disparities across regions and population groups; and (2) a systematic representation of the EV value chain that captures the linkages between the transport, power, and manufacturing sectors as well as the regionally-varying activities and impacts.

     
  3. Allen, Genevra (Ed.)
    Throughout the last decade, random forests have established themselves as among the most accurate and popular supervised learning methods. While their black-box nature has made their mathematical analysis difficult, recent work has established important statistical properties like consistency and asymptotic normality by considering subsampling in lieu of bootstrapping. Though such results open the door to traditional inference procedures, all formal methods suggested thus far place severe restrictions on the testing framework and their computational overhead often precludes their practical scientific use. Here we propose a hypothesis test to formally assess feature significance, which uses permutation tests to circumvent computationally infeasible estimates of nuisance parameters. This test is intended to be analogous to the F-test for linear regression. We establish asymptotic validity of the test via exchangeability arguments and show that the test maintains high power with orders of magnitude fewer computations. Importantly, the procedure scales easily to big data settings where large training and testing sets may be employed, conducting statistically valid inference without the need to construct additional models. Simulations and applications to ecological data, where random forests have recently shown promise, are provided. 
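As a rough illustration of the infinitesimal jackknife variance estimate discussed in the first abstract, the sketch below uses the covariance form commonly cited in the bagging literature, V_IJ = sum_i Cov_b(N_bi, t_b)^2, where N_bi is the number of times observation i appears in bootstrap sample b and t_b is the estimate from that sample. The abstract does not state this formula, so its use here is an assumption, and the function names `ij_variance` and `bagged_mean_demo` are hypothetical. The base estimator is simply the sample mean, chosen because its IJ variance can be checked against the plug-in value sum_i (x_i - x_bar)^2 / n^2.

```python
import random


def ij_variance(counts, estimates):
    """Assumed IJ form: sum over observations i of Cov_b(N_bi, t_b)^2."""
    B = len(estimates)
    n = len(counts[0])
    t_bar = sum(estimates) / B
    total = 0.0
    for i in range(n):
        n_bar = sum(row[i] for row in counts) / B
        # Empirical covariance between the inclusion count of observation i
        # and the bootstrap estimate, taken across the B bootstrap samples.
        cov_i = sum(
            (row[i] - n_bar) * (t - t_bar) for row, t in zip(counts, estimates)
        ) / B
        total += cov_i ** 2
    return total


def bagged_mean_demo(n=20, B=2000, seed=0):
    """Toy example: bootstrap the sample mean and compare IJ to the plug-in variance."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    counts, estimates = [], []
    for _ in range(B):
        row = [0] * n
        for _ in range(n):
            row[rng.randrange(n)] += 1  # draw n indices with replacement
        counts.append(row)
        estimates.append(sum(c * xi for c, xi in zip(row, x)) / n)
    x_bar = sum(x) / n
    plug_in = sum((xi - x_bar) ** 2 for xi in x) / n ** 2
    return ij_variance(counts, estimates), plug_in


if __name__ == "__main__":
    v_ij, v_plug = bagged_mean_demo()
    print(f"IJ estimate: {v_ij:.5f}  plug-in variance of the mean: {v_plug:.5f}")
```

With a finite number of bootstrap samples the two values agree only approximately, since Monte Carlo noise in the covariances inflates the IJ estimate slightly. The regression and classical-jackknife derivations described in the abstract are not reproduced here.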