Abstract Haze in Beijing is linked to atmospherically formed secondary organic aerosol, which has been shown to be particularly harmful to human health. However, the sources and formation pathways of these secondary aerosols remain largely unknown, hindering effective pollution mitigation. Here we have quantified the sources of organic aerosol via direct near-molecular observations in central Beijing. In winter, organic aerosol pollution arises mainly from fresh solid-fuel emissions and secondary organic aerosols originating from both solid-fuel combustion and aqueous processes, probably involving multiphase chemistry with aromatic compounds. The most severe haze is linked to secondary organic aerosols originating from solid-fuel combustion, transported from the Beijing–Tianjin–Hebei Plain and rural mountainous areas west of Beijing. In summer, the increased fraction of secondary organic aerosol is dominated by aromatic emissions from the Xi’an–Shanghai–Beijing region, while the contribution of biogenic emissions remains relatively small. Overall, we identify the main sources of secondary organic aerosol affecting Beijing, which clearly extend beyond the local emissions in Beijing. Our results suggest that targeting key organic precursor emission sectors regionally may be needed to effectively mitigate organic aerosol pollution.
Free, publicly-accessible full text available August 1, 2025
A private learner is trained on a sample of labeled points and generates a hypothesis that can be used for predicting the labels of newly sampled points while protecting the privacy of the training set [Kasiviswanathan et al., FOCS 2008]. Past research uncovered that private learners may need significantly higher sample complexity than non-private learners, as is the case for learning one-dimensional threshold functions [Bun et al., FOCS 2015, Alon et al., STOC 2019]. We explore prediction as an alternative to learning. A predictor answers a stream of classification queries instead of outputting a hypothesis. Earlier work has considered a private prediction model with a single classification query [Dwork and Feldman, COLT 2018]. We observe that when answering a stream of queries, a predictor must modify the hypothesis it uses over time, and in a manner that cannot rely solely on the training set. We introduce private everlasting prediction, taking into account the privacy of both the training set and the (adaptively chosen) queries made to the predictor. We then present a generic construction of private everlasting predictors in the PAC model. The sample complexity of the initial training sample in our construction is quadratic (up to polylog factors) in the VC dimension of the concept class. Our construction allows prediction for all concept classes with finite VC dimension, and in particular threshold functions over infinite domains, for which (traditional) private learning is known to be impossible.
Abstract A key challenge in aerosol pollution studies and climate change assessment is to understand how atmospheric aerosol particles are initially formed [1,2]. Although new particle formation (NPF) mechanisms have been described at specific sites [3–6], in most regions, such mechanisms remain uncertain to a large extent because of the limited ability of atmospheric models to simulate critical NPF processes [1,7]. Here we synthesize molecular-level experiments to develop comprehensive representations of 11 NPF mechanisms and the complex chemical transformation of precursor gases in a fully coupled global climate model. Combined simulations and observations show that the dominant NPF mechanisms are distinct worldwide and vary with region and altitude. Previously neglected or underrepresented mechanisms involving organics, amines, iodine oxoacids and HNO3 probably dominate NPF in most regions with high concentrations of aerosols or large aerosol radiative forcing; such regions include oceanic and human-polluted continental boundary layers, as well as the upper troposphere over rainforests and Asian monsoon regions. These underrepresented mechanisms also play notable roles in other areas, such as the upper troposphere of the Pacific and Atlantic oceans. Accordingly, NPF accounts for different fractions (10–80%) of the nuclei on which clouds form at 0.5% supersaturation over various regions in the lower troposphere. The comprehensive simulation of global NPF mechanisms can help improve estimation and source attribution of the climate effects of aerosols.
Free, publicly-accessible full text available July 4, 2025
Abstract Objectives Racial disparities in kidney transplant access and posttransplant outcomes exist between non-Hispanic Black (NHB) and non-Hispanic White (NHW) patients in the United States, with the site of care being a key contributor. When multi-site data are used to examine the effect of site of care on racial disparities, the key challenge is that patient-level data are difficult to share because of regulations protecting patients’ privacy.
Materials and Methods We developed a federated learning framework, named dGEM-disparity (decentralized algorithm for Generalized linear mixed Effect Model for disparity quantification). dGEM-disparity consists of 2 modules. The first provides accurately estimated common effects and calibrated hospital-specific effects while requiring only aggregated data from each center. The second adopts a counterfactual modeling approach to assess whether graft failure rates would differ if NHB patients had been admitted to transplant centers in the same distribution as NHW patients.
Results Utilizing United States Renal Data System data from 39 043 adult patients across 73 transplant centers over 10 years, we found that if NHB patients had followed the distribution of NHW patients in admissions, there would be 38 fewer deaths or graft failures per 10 000 NHB patients (95% CI, 35-40) within 1 year of receiving a kidney transplant on average.
Discussion The proposed framework facilitates efficient collaborations in clinical research networks. Additionally, the framework, by using counterfactual modeling to calculate the event rate, allows us to investigate contributions to racial disparities that may occur at the level of site of care.
Conclusions Our framework is broadly applicable to other decentralized datasets and disparities research related to differential access to care. Ultimately, our proposed framework will advance equity in human health by identifying and addressing hospital-level racial disparities.
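The counterfactual comparison described under Materials and Methods can be illustrated with a much simpler calculation than the framework itself. The sketch below is not dGEM-disparity: it skips the federated mixed-effects estimation and assumes that a per-center event rate for NHB patients is already known. The center names, rates, and admission shares are hypothetical toy values, not figures from the study.

```python
def expected_rate(center_shares, center_rates):
    """Average event rate when patients are spread over centers by center_shares."""
    total = sum(center_shares.values())
    return sum(share / total * center_rates[center]
               for center, share in center_shares.items())


# Hypothetical toy inputs (assumptions for illustration only).
nhb_rates = {"A": 0.060, "B": 0.040, "C": 0.050}  # NHB event rate at each center
nhb_shares = {"A": 0.5, "B": 0.2, "C": 0.3}       # where NHB patients were admitted
nhw_shares = {"A": 0.2, "B": 0.5, "C": 0.3}       # where NHW patients were admitted

# Observed rate: NHB patients at the centers they actually attended.
observed = expected_rate(nhb_shares, nhb_rates)

# Counterfactual rate: same NHB center-level rates, NHW admission distribution.
counterfactual = expected_rate(nhw_shares, nhb_rates)

fewer_events_per_10000 = (observed - counterfactual) * 10_000
print(round(fewer_events_per_10000, 1))
```

With these toy inputs, the difference between the observed and counterfactual rates is the part of the event rate attributable to where patients were admitted, expressed per 10 000 patients as in the Results.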
Gas-phase oxygenated organic molecules (OOMs) can contribute significantly to both atmospheric new particle growth and secondary organic aerosol formation. Precursor apportionment of atmospheric OOMs connects them with volatile organic compounds (VOCs). Since atmospheric OOMs are often highly functionalized products of multistep reactions, it is challenging to reveal the complete mapping relationships between OOMs and their precursors. In this study, we demonstrate that the machine learning method is useful in attributing atmospheric OOMs to their precursors using several chemical indicators, such as O/C ratio and H/C ratio. The model is trained and tested using data acquired in controlled laboratory experiments, covering the oxidation products of four main types of VOCs (isoprene, monoterpenes, aliphatics, and aromatics). Then, the model is used for analyzing atmospheric OOMs measured in both urban Beijing and a boreal forest environment in southern Finland. The results suggest that atmospheric OOMs in these two environments can be reasonably assigned to their precursors. Beijing is an anthropogenic VOC dominated environment with ∼64% aromatic and aliphatic OOMs, and the other boreal forested area has ∼76% monoterpene OOMs. This pilot study shows that machine learning can be a promising tool in atmospheric chemistry for connecting the dots.
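The chemical indicators named above, the O/C and H/C ratios, can be computed directly from a molecular formula. The sketch below covers only that feature-extraction step. The abstract does not specify the classifier or the full feature set, so none is shown, and the function name and example formula are assumptions made for illustration.

```python
import re


def elemental_ratios(formula):
    """Return (O/C, H/C) for a simple formula such as 'C10H16O7'."""
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    carbon = counts.get("C", 0)
    if carbon == 0:
        raise ValueError("formula contains no carbon: " + formula)
    return counts.get("O", 0) / carbon, counts.get("H", 0) / carbon


# Hypothetical example formula, chosen only to show the calculation.
o_to_c, h_to_c = elemental_ratios("C10H16O7")
print(o_to_c, h_to_c)
```

Ratios like these could then serve as input features for whichever supervised classifier is trained on the laboratory oxidation products.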
Abstract Objective Supporting public health research and the public’s situational awareness during a pandemic requires continuous dissemination of infectious disease surveillance data. Legislation, such as the Health Insurance Portability and Accountability Act of 1996 and recent state-level regulations, permits sharing deidentified person-level data; however, current deidentification approaches are limited. Namely, they are inefficient, relying on retrospective disclosure risk assessments, and do not flex with changes in infection rates or population demographics over time. In this paper, we introduce a framework to dynamically adapt deidentification for near-real time sharing of person-level surveillance data.
Materials and Methods The framework leverages a simulation mechanism, capable of application at any geographic level, to forecast the reidentification risk of sharing the data under a wide range of generalization policies. The estimates inform weekly, prospective policy selection to maintain the proportion of records corresponding to a group size less than 11 (PK11) at or below 0.1. Fixing the policy at the start of each week facilitates timely dataset updates and supports sharing granular date information. We use August 2020 through October 2021 case data from Johns Hopkins University and the Centers for Disease Control and Prevention to demonstrate the framework’s effectiveness in maintaining the PK11 threshold of 0.01.
Results When sharing COVID-19 county-level case data across all US counties, the framework’s approach meets the threshold for 96.2% of daily data releases, while a policy based on current deidentification techniques meets the threshold for 32.3%.
Conclusion Periodically adapting the data publication policies preserves privacy while enhancing public health utility through timely updates and sharing epidemiologically critical features.
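The PK11 metric defined above, the proportion of records that fall in a group of fewer than 11, is simple to compute for a dataset already in hand. The sketch below shows only that calculation. It does not reproduce the framework's simulation or forecasting, and the record layout (tuples of quasi-identifier values) and the toy data are assumptions.

```python
from collections import Counter


def pk(records, k=11):
    """Proportion of records whose quasi-identifier group has fewer than k members."""
    if not records:
        return 0.0
    group_sizes = Counter(records)
    in_small_groups = sum(size for size in group_sizes.values() if size < k)
    return in_small_groups / len(records)


# Hypothetical toy data: (county, age band) pairs as quasi-identifiers.
detailed = [("county1", "30-39")] * 12 + [("county1", "80+")] * 3

# A coarser generalization policy that drops the age band entirely.
generalized = [(county,) for county, _age in detailed]

pk11_detailed = pk(detailed)
pk11_generalized = pk(generalized)
print(pk11_detailed, pk11_generalized)
```

In this toy case the three records in the small age band give a PK11 of 0.2 under the detailed policy, while the coarser policy merges all records into one group of 15 and brings PK11 to 0. This is the kind of trade-off a policy search would weigh against data utility.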