Creators/Authors contains: "Yang, Shihao"

  1. Abstract

    The prolonged COVID-19 pandemic has tied up significant medical resources, and its management poses a challenge for public health care decision making. Accurate predictions of hospitalizations are crucial for decision makers to make informed decisions about medical resource allocation. This paper proposes a method named County Augmented Transformer (CAT) to generate accurate predictions of four-week-ahead COVID-19 related hospitalizations for every state in the United States. Inspired by modern deep learning techniques, our method is based on a self-attention model (known as the transformer model) that is actively used in Natural Language Processing. Our transformer-based model can capture both short-term and long-term dependencies within the time series while enjoying computational efficiency. Our model is a data-driven approach that utilizes publicly available information, including COVID-19 related confirmed cases, deaths, and hospitalizations data, and household median income data. Our numerical experiments demonstrate the strength and usability of our model as a potential tool for assisting medical resource allocation.

    Free, publicly-accessible full text available December 1, 2024
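
The self-attention mechanism mentioned in the abstract above can be illustrated with a minimal sketch. This is not the CAT architecture, whose details are not given here; it is a generic single-head scaled dot-product attention applied to one window of a multivariate time series. The function name, the window length, the feature count, and the random projection matrices are all assumptions made for illustration.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has shape (T, d_in): T time steps, d_in features per step.
    Returns the attended sequence (T, d_model) and the (T, T) weights.
    """
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Softmax over the key axis, shifted by the row maximum for stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical input: 8 weekly observations of 3 features
# (for example cases, deaths, hospitalizations), filled with random numbers.
rng = np.random.default_rng(0)
T, d_in, d_model = 8, 3, 4
window = rng.normal(size=(T, d_in))
w_q, w_k, w_v = (rng.normal(size=(d_in, d_model)) for _ in range(3))

context, weights = self_attention(window, w_q, w_k, w_v)
```

Each row of `weights` sums to one and describes how much a given week attends to every other week in the window, which is how one attention layer can relate both nearby and distant time steps.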
  2. Abstract The global spread of COVID-19, the disease caused by the novel coronavirus SARS-CoV-2, has posed a significant threat to mankind. As the COVID-19 situation continues to evolve, predicting localized disease severity is crucial for advance resource allocation. This paper proposes a method named COURAGE (COUnty aggRegation mixup AuGmEntation) to generate a short-term prediction of 2-week-ahead COVID-19 related deaths for each county in the United States, leveraging modern deep learning techniques. Specifically, our method adopts a self-attention model from Natural Language Processing, known as the transformer model, to capture both short-term and long-term dependencies within the time series while enjoying computational efficiency. Our model solely utilizes publicly available information for COVID-19 related confirmed cases, deaths, community mobility trends and demographic information, and can produce state-level predictions as an aggregation of the corresponding county-level predictions. Our numerical experiments demonstrate that our model achieves state-of-the-art performance among the publicly available benchmark models.
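
The aggregation step described above, where state-level predictions are formed from county-level predictions, can be sketched as follows. The abstract does not say how the aggregation is computed; this sketch assumes a plain sum and relies on the fact that the first two digits of a five-digit county FIPS code identify the state. The function name and the predicted values are hypothetical.

```python
from collections import defaultdict

def aggregate_to_state(county_predictions):
    """Sum county-level predictions into state-level totals.

    Keys are five-digit county FIPS codes given as strings; the first
    two characters are the state FIPS code.
    """
    state_totals = defaultdict(float)
    for county_fips, predicted in county_predictions.items():
        state_totals[county_fips[:2]] += predicted
    return dict(state_totals)

# Hypothetical 2-week-ahead death predictions for three counties.
county_predictions = {
    "13121": 12.0,  # Fulton County, Georgia
    "13089": 7.5,   # DeKalb County, Georgia
    "36061": 20.0,  # New York County, New York
}
state_predictions = aggregate_to_state(county_predictions)
```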
  3. Abstract For epidemic control and prevention, timely insight into potential hot spots is invaluable. As an alternative to traditional epidemic surveillance, which often lags behind real time by weeks, big data from the Internet provide important information about current epidemic trends. Here we present a methodology, ARGOX (Augmented Regression with GOogle data CROSS space), for accurate real-time tracking of state-level influenza epidemics in the United States. ARGOX combines Internet search data at the national, regional and state levels with traditional influenza surveillance data from the Centers for Disease Control and Prevention, and accounts for both the spatial correlation structure of state-level influenza activities and the evolution of people’s Internet search patterns. ARGOX achieves on average a 28% error reduction over the best alternative for real-time state-level influenza estimation from 2014 to 2020. ARGOX is robust and reliable and can potentially be applied to track county- and city-level influenza activity and other infectious diseases.
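
The general idea of regressing a surveillance target on both Internet search volumes and lagged surveillance values can be sketched in a few lines. This is not the ARGOX estimator, which the abstract does not specify; as a stand-in it uses an L2-penalized (ridge) linear regression on synthetic data. All names, dimensions, coefficients, and the penalty value are assumptions.

```python
import numpy as np

def fit_ridge(features, target, penalty):
    """Closed-form ridge regression with an unpenalized intercept."""
    n, p = features.shape
    design = np.hstack([np.ones((n, 1)), features])
    reg = penalty * np.eye(p + 1)
    reg[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(design.T @ design + reg, design.T @ target)

def predict(coef, features):
    """Apply fitted coefficients (intercept first) to new features."""
    return coef[0] + features @ coef[1:]

# Synthetic stand-ins: 5 search-query series and 2 lagged surveillance series.
rng = np.random.default_rng(1)
n_weeks = 120
search = rng.normal(size=(n_weeks, 5))
lagged = rng.normal(size=(n_weeks, 2))
features = np.hstack([search, lagged])
true_coef = np.array([0.8, 0.0, 0.0, 0.3, 0.0, 0.5, 0.2])
target = 1.0 + features @ true_coef + 0.01 * rng.normal(size=n_weeks)

# Fit on the first 100 weeks, then estimate the remaining 20.
coef = fit_ridge(features[:100], target[:100], penalty=0.1)
forecast = predict(coef, features[100:])
```

In a real-time setting the model would be refit as each new week of surveillance data arrives; the single train/test split here only keeps the sketch short.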
  4. We present an interpretable high-resolution spatio-temporal model to estimate COVID-19 deaths together with confirmed cases 1 week ahead of the current time, at the county level and weekly aggregated, in the United States. A notable feature of our spatio-temporal model is that it considers (1) the temporal auto- and pairwise correlation of the two local time series (confirmed cases and deaths from COVID-19), (2) the correlation between locations (propagation between counties), and (3) covariates such as local within-community mobility and sociodemographic factors. The within-community mobility and demographic factors, such as total population and the proportion of the elderly, are included as predictors since they are hypothesized to be important in determining the dynamics of COVID-19. To reduce the model’s high dimensionality, we impose sparsity structures as constraints and emphasize the impact of the top 10 metropolitan areas in the nation, which we refer to (and treat within our models) as hubs in spreading the disease. Our retrospective out-of-sample county-level predictions were able to forecast the subsequently observed COVID-19 activity accurately. The proposed multivariate predictive models were designed to be highly interpretable, with clear identification and quantification of the most important factors that determine the dynamics of COVID-19. Ongoing work involves incorporating more covariates, such as education and income, to improve prediction accuracy and model interpretability.
  5. Parameter estimation for nonlinear dynamic system models, represented by ordinary differential equations (ODEs), using noisy and sparse data, is a vital task in many fields. We propose a fast and accurate method, manifold-constrained Gaussian process inference (MAGI), for this task. MAGI uses a Gaussian process model over time series data, explicitly conditioned on the manifold constraint that derivatives of the Gaussian process must satisfy the ODE system. By doing so, we completely bypass the need for numerical integration and achieve substantial savings in computational time. MAGI is also suitable for inference with unobserved system components, which often occur in real experiments. MAGI is distinct from existing approaches as we provide a principled statistical construction under a Bayesian framework, which incorporates the ODE system through the manifold constraint. We demonstrate the accuracy and speed of MAGI using realistic examples based on physical experiments.
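
The manifold constraint described in this abstract can be written out as follows. The notation is an assumption introduced for illustration (the abstract itself gives no formulas): $x(t)$ is the system state, $\theta$ the unknown parameters, $f$ the right-hand side of the ODE, $y(\tau)$ the noisy observations at times $\tau$, and $I$ a finite set of discretization points.

```latex
% ODE system with a Gaussian process prior on the state, observed with noise
\frac{dx(t)}{dt} = f\bigl(x(t), \theta, t\bigr), \qquad
x(t) \sim \mathcal{GP}(\mu, \mathcal{K}), \qquad
y(\tau) = x(\tau) + \varepsilon(\tau)

% Manifold constraint: the GP derivative must agree with the ODE
W = \sup_{t \in [0, T]} \bigl\lVert \dot{x}(t) - f\bigl(x(t), \theta, t\bigr) \bigr\rVert_{\infty} = 0

% Inference targets the posterior conditioned on the discretized constraint
p\bigl(\theta, x(I) \mid W_I = 0,\; y(\tau)\bigr)
```

Because the derivative of a Gaussian process is again a Gaussian process, conditioning on $W_I = 0$ relates $\dot{x}$ and $f$ directly at the points in $I$, which is why no numerical ODE solver is needed.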
  6. Abstract

    Accurate, high-resolution tracking of influenza epidemics at the regional level helps public health agencies make informed and proactive decisions, especially in the face of outbreaks. Internet users’ online searches offer great potential for the regional tracking of influenza. However, due to the complex data structure and reduced quality of Internet data at the regional level, few established methods provide satisfactory performance. In this article, we propose a novel method named ARGO2 (2-step Augmented Regression with GOogle data) that efficiently combines publicly available Google search data at different resolutions (national and regional) with traditional influenza surveillance data from the Centers for Disease Control and Prevention (CDC) for accurate, real-time regional tracking of influenza. ARGO2 gives very competitive performance across all US regions compared with available Internet-data-based regional influenza tracking methods, and it has achieved 30% error reduction over the best alternative method that we numerically tested for the period of March 2009 to March 2018. ARGO2 is reliable and robust, with the flexibility to incorporate additional information from other sources and resolutions, making it a powerful tool for regional influenza tracking, and potentially for tracking other social, economic, or public health events at the regional or local level.