Search for: All records

Creators/Authors contains: "Xu, Jason"

  1. Viral deep-sequencing data play a crucial role in understanding disease transmission network flows, providing higher resolution than standard Sanger sequencing. To more fully utilize these rich data and account for the uncertainties in outcomes from phylogenetic analyses, we propose a spatial Poisson process model to uncover human immunodeficiency virus (HIV) transmission flow patterns at the population level. We represent pairings of individuals with viral sequence data as typed points, with coordinates representing covariates such as gender and age and point types representing the unobserved transmission statuses (linkage and direction). Points are associated with observed scores on the strength of evidence for each transmission status that are obtained through standard deep-sequence phylogenetic analysis. Our method is able to jointly infer the latent transmission statuses for all pairings and the transmission flow surface on the source-recipient covariate space. In contrast to existing methods, our framework does not require preclassification of the transmission statuses of data points, and instead learns them probabilistically through a fully Bayesian inference scheme. By directly modeling continuous spatial processes with smooth densities, our method enjoys significant computational advantages compared to previous methods that rely on discretization of the covariate space. We demonstrate that our framework can capture age structures in HIV transmission at high resolution, bringing valuable insights in a case study on viral deep-sequencing data from Southern Uganda.

     
  2. Throughout the course of an epidemic, the rate at which disease spreads varies with behavioral changes, the emergence of new disease variants, and the introduction of mitigation policies. Estimating such changes in transmission rates can help us better model and predict the dynamics of an epidemic, and provide insight into the efficacy of control and intervention strategies. We present a method for likelihood‐based estimation of parameters in the stochastic susceptible‐infected‐removed model under a time‐inhomogeneous transmission rate composed of piecewise constant components. In doing so, our method simultaneously learns change points in the transmission rate via a Markov chain Monte Carlo algorithm. The method targets the exact model posterior in a difficult missing data setting given only partially observed case counts over time. We validate performance on simulated data before applying our approach to data from an Ebola outbreak in Western Africa and a COVID‐19 outbreak on a university campus.

     
  3. Word embeddings, which represent words as dense feature vectors, are widely used in natural language processing. In their seminal paper on word2vec, Mikolov and colleagues showed that a feature space created by training a word prediction network on a large text corpus will encode semantic information that supports analogy by vector arithmetic, e.g., “king” minus “man” plus “woman” equals “queen”. To help novices appreciate this idea, people have sought effective graphical representations of word embeddings. We describe a new interactive tool for visually exploring word embeddings. Our tool allows users to define semantic dimensions by specifying opposed word pairs, e.g., gender is defined by pairs such as boy/girl and father/mother, and age by pairs such as father/son and mother/daughter. Words are plotted as points in a zoomable and rotatable 3D space, where the third “residual” dimension encodes distance from the hyperplane defined by all the opposed word vectors with age and gender subtracted out. Our tool allows users to visualize vector analogies, drawing the vector from “king” to “man” and a parallel vector from “woman” to “king-man+woman”, which is closest to “queen”. Visually browsing the embedding space and experimenting with this tool can make word embeddings more intuitive. We include a series of experiments teachers can use to help K-12 students appreciate the strengths and limitations of this representation.
  5. Intensive care occupancy is an important indicator of health care stress that has been used to guide policy decisions during the COVID‐19 pandemic. Toward reliable decision‐making as a pandemic progresses, estimating the rates at which patients are admitted to and discharged from hospitals and intensive care units (ICUs) is crucial. Since individual‐level hospital data are rarely available to modelers in each geographic locality of interest, it is important to develop tools for inferring these rates from publicly available daily numbers of hospital and ICU beds occupied. We develop such an estimation approach based on an immigration‐death process that models fluctuations of ICU occupancy. Our flexible framework allows for immigration and death rates to depend on covariates, such as hospital bed occupancy and daily SARS‐CoV‐2 test positivity rate, which may drive changes in hospital ICU operations. We demonstrate via simulation studies that the proposed method performs well on noisy time series data and apply our statistical framework to hospitalization data from the University of California, Irvine (UCI) Health and Orange County, California. By introducing a likelihood‐based framework where immigration and death rates can vary with covariates, we find, through rigorous model selection, that hospitalization and positivity rates are crucial covariates for modeling ICU stay dynamics and validate our per‐patient ICU stay estimates using anonymized patient‐level UCI hospital data.

     
  6. Ndeffo Mbah, Martial L (Ed.)
    The SARS-CoV-2 pandemic led to closure of nearly all K-12 schools in the United States of America in March 2020. Although reopening K-12 schools for in-person schooling is desirable for many reasons, officials understand that risk reduction strategies and detection of cases are imperative in creating a safe return to school. Furthermore, the consequences of reclosing recently opened schools are substantial and impact teachers, parents, and ultimately the educational experiences of children. To address the competing interests of meeting educational needs and protecting public safety, we compare the impact of physical separation through school cohorts on SARS-CoV-2 infections against policies acting at the level of individual contacts within classrooms. Using an age-stratified Susceptible-Exposed-Infected-Removed model, we explore influences of reduced class density, transmission mitigation, and viral detection on cumulative prevalence. We consider several scenarios over a 6-month period including (1) multiple rotating cohorts in which students cycle through in-person instruction on a weekly basis, (2) parallel cohorts with in-person and remote learning tracks, (3) the impact of a hypothetical testing program with ideal and imperfect detection, and (4) varying levels of aggregate transmission reduction. Our mathematical model predicts that reducing the number of contacts through cohorts produces a larger effect than diminishing transmission rates per contact. Specifically, the latter approach requires a dramatic reduction in transmission rates in order to achieve a comparable effect in minimizing infections over time. Further, our model indicates that surveillance programs using less sensitive tests may be adequate in monitoring infections within a school community by both keeping infections low and allowing for a longer period of instruction. Lastly, we underscore the importance of factoring in infection prevalence when deciding whether a local outbreak of infection is serious enough to require reverting to remote learning.
  8. Background: Health care personnel (HCP) are at high risk for exposure to the SARS-CoV-2 virus. While personal protective equipment (PPE) may mitigate this risk, prospective data collection on its use and other risk factors for seroconversion in this population is needed. Objective: The primary objectives of this study are to (1) determine the incidence of, and risk factors for, SARS-CoV-2 infection among HCP at a tertiary care medical center and (2) actively monitor PPE use, interactions between study participants via electronic sensors, secondary cases in households, and participant mental health and well-being. Methods: To achieve these objectives, we designed a prospective, observational study of SARS-CoV-2 infection among HCP and their household contacts at an academic tertiary care medical center in North Carolina, USA. Enrolled HCP completed frequent surveys on symptoms and work activities and provided serum and nasal samples for SARS-CoV-2 testing every 2 weeks. Additionally, interactions between participants and their movement within the clinical environment were captured with a smartphone app and Bluetooth sensors. Finally, a subset of participants’ households was randomly selected every 2 weeks for further investigation, and enrolled households provided serum and nasal samples via at-home collection kits. Results: As of December 31, 2020, 211 HCP and 53 household participants have been enrolled. Recruitment and follow-up are ongoing and expected to continue through September 2021. Conclusions: Much remains to be learned regarding the risk of SARS-CoV-2 infection among HCP and their household contacts. Through the use of a multifaceted prospective study design and a well-characterized cohort, we will collect critical information regarding SARS-CoV-2 transmission risks in the health care setting and its linkage to the community. International Registered Report Identifier (IRRID): DERR1-10.2196/25410