skip to main content


Search for: All records

Creators/Authors contains: "Jiang, X."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. M. B. Goldwater ; F. K. Anggoro ; B. K. Hayes ; D. C. Ong (Ed.)
    Free, publicly-accessible full text available July 25, 2024
  2. Free, publicly-accessible full text available June 1, 2024
  3. The process of matching patients with suitable clinical trials is essential for advancing medical research and providing optimal care. However, current approaches face challenges such as data standardization, ethical considerations, and a lack of interoperability between Electronic Health Records (EHRs) and clinical trial criteria. In this paper, we explore the potential of large language models (LLMs) to address these challenges by leveraging their advanced natural language generation capabilities to improve compatibility between EHRs and clinical trial descriptions. We propose an innovative privacy-aware data augmentation approach for LLM-based patient-trial matching (LLM-PTM), which balances the benefits of LLMs while ensuring the security and confidentiality of sensitive patient data. Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%. Additionally, we present case studies to further illustrate the effectiveness of our approach and provide a deeper understanding of its underlying principles. 
    more » « less
  4. Kinship relationship estimation plays a significant role in today's genome studies. Since genetic data are mostly stored and protected in different silos, retrieving the desirable kinship relationships across federated data warehouses is a non-trivial problem. The ability to identify and connect related individuals is important for both research and clinical applications. In this work, we propose a new privacy-preserving kinship relationship estimation framework: Incremental Update Kinship Identification (INK). The proposed framework includes three key components that allow us to control the balance between privacy and accuracy (of kinship estimation): an incremental process coupled with the use of auxiliary information and informative scores. Our empirical evaluation shows that INK can achieve higher kinship identification correctness while exposing fewer genetic markers. 
    more » « less
  5. null (Ed.)
    Input uncertainty is an aspect of simulation model risk that arises when the driving input distributions are derived or “fit” to real-world, historical data. Although there has been significant progress on quantifying and hedging against input uncertainty, there has been no direct attempt to reduce it via better input modeling. The meaning of “better” depends on the context and the objective: Our context is when (a) there are one or more families of parametric distributions that are plausible choices; (b) the real-world historical data are not expected to perfectly conform to any of them; and (c) our primary goal is to obtain higher-fidelity simulation output rather than to discover the “true” distribution. In this paper, we show that frequentist model averaging can be an effective way to create input models that better represent the true, unknown input distribution, thereby reducing model risk. Input model averaging builds from standard input modeling practice, is not computationally burdensome, requires no change in how the simulation is executed nor any follow-up experiments, and is available on the Comprehensive R Archive Network (CRAN). We provide theoretical and empirical support for our approach. 
    more » « less