Search for: All records

Creators/Authors contains: "Wang, Xinyue"

  1. Nikolski, Macha (Ed.)
    Abstract Motivation

    Genome-wide association studies (GWAS) benefit from the increasing availability of genomic data and cross-institution collaborations. However, sharing data across institutional boundaries jeopardizes medical data confidentiality and patient privacy. While modern cryptographic techniques provide formal security guarantees, their substantial communication and computational overheads hinder the practical application of large-scale collaborative GWAS.


    This work introduces an efficient framework for conducting collaborative GWAS on distributed datasets, maintaining data privacy without compromising the accuracy of the results. We propose a novel two-step strategy aimed at reducing communication and computational overheads, and we employ iterative and sampling techniques to ensure accurate results. We instantiate our approach using logistic regression, a commonly used statistical method for identifying associations between genetic markers and the phenotype of interest. We evaluate our proposed methods using two real genomic datasets and demonstrate their robustness in the presence of between-study heterogeneity and skewed phenotype distributions using a variety of experimental settings. The empirical results show the efficiency and applicability of the proposed method and its promise for large-scale collaborative GWAS.

    Availability and implementation

    The source code and data are available at

    Free, publicly-accessible full text available October 1, 2024
  2. Abstract

    To understand diurnal variations in PM2.5 composition and aerosol extract absorption, PM2.5 samples were collected at intervals of 2 hr from 8:00 to 20:00 and 6 hr from 20:00 to 8:00 (the next day) in northern Nanjing, China, during the winter and summer of 2019–2020 and analyzed for bulk components, organic tracers, and light absorption of water and methanol extracts, a proxy measure of brown carbon (BrC). Diurnal patterns of measured species reflected the influences of primary emissions and atmospheric processes. Light absorption coefficients of water (Abs365,w) and methanol extracts (Abs365,m) at 365 nm shared a similar diurnal profile peaking at 18:00–20:00, generally following changes in biomass burning tracers. However, Abs365,w, Abs365,m, and their normalizations to organic aerosols increased at 14:00–16:00, earlier than that of levoglucosan in the late afternoon, which was attributed to secondarily formed BrC. The methanol extracts showed a less drastic decrease in light absorption at night than the water extracts and elevated absorption efficiency during 2:00–8:00. This is because water‐insoluble OC has a longer lifetime and stronger light absorption than water‐soluble OC. According to the source apportionment results solved by positive matrix factorization (PMF), biomass burning and secondary formation were the major BrC sources in northern Nanjing, with an average total relative contribution of about 90%. Compared to previous studies, diurnal source cycles were added to the PMF simulations in this work by using time‐resolved speciation data, which avoided misclassification of BrC sources.

    Free, publicly-accessible full text available September 27, 2024
  3. Abstract Structural health monitoring (SHM) is the automation of the condition assessment process of an engineered system. When applied to geometrically large components or structures, such as those found in civil and aerospace infrastructure and systems, a critical challenge is in designing the sensing solution that could yield actionable information. This is a difficult task to conduct cost-effectively, because of the large surfaces under consideration and the localized nature of typical defects and damages. There have been significant research efforts in empowering conventional measurement technologies for applications to SHM in order to improve the performance of the condition assessment process. Yet, the field implementation of these SHM solutions is still in its infancy, attributable to various economic and technical challenges. The objective of this Roadmap publication is to discuss modern measurement technologies that were developed for SHM purposes, along with their associated challenges and opportunities, and to provide a path to research and development efforts that could yield impactful field applications. The Roadmap is organized into four sections: distributed embedded sensing systems, distributed surface sensing systems, multifunctional materials, and remote sensing. Recognizing that many measurement technologies may overlap between sections, we define distributed sensing solutions as those that involve or imply the use of multiple sensors geometrically organized within (embedded) or over (surface) the monitored component or system. Multifunctional materials are sensing solutions that combine multiple capabilities, for example those also serving structural functions. Remote sensing solutions are contactless, for example those based on cell phones, drones, and satellites; the category also includes remotely controlled robots.
  4. Abstract

    The Hunga Tonga Hunga‐Ha'apai (HTHH) volcanic eruption on 15 January 2022 injected water vapor and SO2 into the stratosphere. Several months after the eruption, significantly stronger westerlies and a weaker Brewer‐Dobson circulation developed in the stratosphere of the Southern Hemisphere and were accompanied by unprecedented temperature anomalies in the stratosphere and mesosphere. In August 2022, the Sounding of the Atmosphere using Broadband Emission Radiometry (SABER) satellite instrument observed record‐breaking temperature anomalies in the stratosphere and mesosphere that alternate signs with altitude. Ensemble simulations carried out with the Whole Atmosphere Community Climate Model (WACCM6) indicate that the strengthening of the stratospheric westerlies explains the mesospheric temperature changes. The stronger westerlies cause stronger westward gravity wave drag in the mesosphere. Although the enhanced gravity wave drag is partly balanced by a weakening of planetary wave forcing, the net result is an acceleration of the mesospheric mean meridional circulation. The stronger mesospheric circulation, in turn, plays a dominant role in driving the changes in mesospheric temperatures. This study highlights the impact of large volcanic eruptions on middle atmospheric dynamics and provides insight into their long‐term effects in the mesosphere. On the other hand, we could not discern a clear mechanism for the observed changes in stratospheric circulation. In fact, an examination of the WACCM ensemble reveals that not every member reproduces the large changes observed by SABER. We conclude that there is a stochastic component to the stratospheric response to the HTHH eruption.

  5. Abstract

    The stratospheric influence on summertime high surface ozone (O3) events is examined using a twenty-year simulation from the Whole Atmosphere Community Climate Model. We find that O3 transported from the stratosphere makes a significant contribution to the surface O3 variability where background surface O3 exceeds the 95th percentile, especially over the western U.S. Maximum covariance analysis is applied to O3 anomalies paired with stratospheric O3 tracer anomalies to identify the stratospheric intrusion and the underlying dynamical mechanism. The first leading mode corresponds to deep stratospheric intrusions in the western and northern tier of the U.S., and intensified northeasterlies in the mid-to-lower troposphere along the west coast, which also facilitate the transport to the eastern Pacific Ocean. The second leading mode corresponds to deep intrusions over the Intermountain Regions. Both modes are associated with eastward propagating baroclinic systems, which are amplified near the end of the North Pacific storm tracks, leading to strong descents over the western U.S.

  6.
    The Tweet Collection Management (TWT) Team aims to ingest 5 billion tweets, clean this data, analyze the metadata present, extract key information, classify tweets into categories, and finally index these tweets into Elasticsearch for browsing and querying. The main deliverable of this project is a running software application for searching tweets and for viewing Twitter collections from Digital Library Research Laboratory (DLRL) event archive projects. As a starting point, we focused on two development goals: (1) hashtag-based and (2) username-based search for tweets. For IR1, we completed extraction of two fields within our sample collection: hashtags and username. Sample code for TwiRole, a user-classification program, was investigated for use in our project. We were able to sample from multiple collections of tweets, spanning topics like COVID-19 and hurricanes. Initial work used a sample collection provided via Google Drive. NFS-based persistent storage was added later to allow access to larger collections. In total, we have developed 9 services to extract key information like username, hashtags, geo-location, and keywords from tweets. We have also developed services for parsing and cleaning raw API data and for backing up data in an Apache Parquet filestore. All services are Dockerized and added to the GitLab Container Registry. The services are deployed in the CS cloud cluster to integrate them into the full search engine workflow. A service was created to convert WARC files to JSON so that archive files can be read into the application. Unit testing of services is complete, and end-to-end tests have been conducted to improve system robustness and avoid failure during deployment. The TWT team has indexed 3,200 tweets into the Elasticsearch index. Future work could involve parallelization of the extraction of metadata, an alternative feature-flag approach, advanced geo-location inference, and adoption of the DMI-TCAT format.
    Key deliverables include a data body that allows for search, sort, filter, and visualization of raw tweet collections and metadata analysis; a running software application for searching tweets and for viewing Twitter collections from Digital Library Research Laboratory (DLRL) event archive projects; and a user guide to assist those using the system.
  7. Abstract

    The Hunga Tonga‐Hunga Ha'apai (HTHH) volcanic eruption in January 2022 injected unprecedented amounts of water vapor (H2O) and a moderate amount of the aerosol precursor sulfur dioxide (SO2) into the Southern Hemisphere (SH) tropical stratosphere. The H2O and aerosol perturbations have persisted during 2022 and early 2023 and dispersed throughout the atmosphere. Observations show large‐scale SH stratospheric cooling, equatorward shift of the Antarctic polar vortex and slowing of the Brewer‐Dobson circulation. Satellite observations show substantial ozone reductions over SH winter midlatitudes that coincide with the largest circulation anomalies. Chemistry‐climate model simulations forced by realistic HTHH inputs of H2O and SO2 qualitatively reproduce the observed evolution of the H2O and aerosol plumes over the first year, and the model exhibits stratospheric cooling, circulation changes and ozone effects similar to observed behavior. The agreement demonstrates that the observed stratospheric changes are caused by the HTHH volcanic influences.
