Title: Covid-19 Pandemic Data Analysis Using Tensor Methods
In this paper, we use tensor models to analyze Covid-19 pandemic data. First, we use two tensor models, the canonical polyadic and higher-order Tucker decompositions, to extract patterns over multiple modes. Second, we implement a tensor completion algorithm based on the canonical polyadic decomposition to predict spatiotemporal data from multiple spatial sources and to identify Covid-19 hotspots. We apply a regularized iterative tensor completion technique with a practical regularization parameter estimator to predict the spread of Covid-19 cases and to identify hotspots. Our method can predict weekly and quarterly Covid-19 spread with high accuracy. Third, we analyze Covid-19 data in the US using a novel sampling method for alternating least squares. Moreover, we compare the algorithms with standard tensor decompositions with respect to interpretability, visualization, and cost. Finally, we demonstrate the efficacy of the methods by applying them to the New Jersey Covid-19 case tensor data.
Award ID(s):
2126374
PAR ID:
10609514
Author(s) / Creator(s):
; ;
Publisher / Repository:
Computational Algorithms and Numerical Dimensions
Date Published:
Journal Name:
Computational Algorithms and Numerical Dimensions
Volume:
3
Issue:
1
ISSN:
2980-9320
Page Range / eLocation ID:
17-44
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The analysis of functional near-infrared spectroscopy (fNIRS) signals has not kept pace with the increased use of fNIRS in the behavioral and brain sciences. The popular grand averaging method collapses the oxygenated hemoglobin data within a predefined time-of-interest window and across multiple channels within a region of interest, potentially leading to a loss of important temporal and spatial information. In contrast, the tensor decomposition method can reveal patterns in the data without making prior assumptions about the hemodynamic response and without losing temporal and spatial information. The aim of the current study was to examine whether the tensor decomposition method could identify significant effects and novel patterns compared to the commonly used grand averaging method for fNIRS signal analysis. We used two infant fNIRS datasets and applied tensor decomposition (i.e., canonical polyadic and Tucker decompositions) to analyze the significant differences in the hemodynamic response patterns across conditions. The code is publicly available on GitHub. Bayesian analyses were performed to understand interaction effects. The results from the tensor decomposition method replicated the findings from the grand averaging method and uncovered additional patterns not detected by the grand averaging method. Our findings demonstrate that tensor decomposition is a feasible alternative method for analyzing fNIRS signals, offering a more comprehensive understanding of the data and its underlying patterns.
  2. Learning nonlinear functions from input-output data pairs is one of the most fundamental problems in machine learning. Recent work has formulated the problem of learning a general nonlinear multivariate function of discrete inputs as a tensor completion problem with smooth latent factors. We build upon this idea and utilize two ensemble learning techniques to enhance its prediction accuracy. Ensemble methods can be divided into two main groups, parallel and sequential. Bagging, also known as bootstrap aggregation, is a parallel ensemble method where multiple base models are trained in parallel on different subsets of the data that have been chosen randomly with replacement from the original training data. The outputs of these models are usually combined, and a single prediction is computed by averaging. One of the most popular bagging techniques is random forests. Boosting is a sequential ensemble method where a sequence of base models is fit sequentially to modified versions of the data. Popular boosting algorithms include AdaBoost and Gradient Boosting. We develop two approaches based on these ensemble learning techniques for learning multivariate functions using the Canonical Polyadic Decomposition. We showcase the effectiveness of the proposed ensemble models on several regression tasks and report significant improvements compared to the single model.
  3. Accurate prediction of the transmission of epidemic diseases such as COVID-19 is crucial for implementing effective mitigation measures. In this work, we develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously. We construct a 3-way spatio-temporal tensor (location, attribute, time) of case counts and propose a nonnegative tensor factorization with latent epidemiological model regularization named STELAR. Unlike standard tensor factorization methods, which cannot predict slabs ahead, STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations of a widely adopted epidemiological model. We use latent instead of location/attribute-level epidemiological dynamics to capture common epidemic profile sub-types and improve collaborative learning and prediction. We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic. Finally, we evaluate the predictive ability of our method and show superior performance compared to the baselines, achieving up to 21% lower root mean square error and 25% lower mean absolute error for county-level prediction.
  4. Discovering components that are shared across multiple datasets, alongside dataset-specific features, has great potential for studying the relationships between different subjects or tasks in functional Magnetic Resonance Imaging (fMRI) data. Coupled matrix and tensor factorization approaches have been useful for flexible data fusion, or decomposition to extract features that can be used in multiple ways. However, existing methods do not directly recover shared and dataset-specific components, and so require post-processing steps involving additional hyperparameter selection. In this paper, we propose a tensor-based framework for multi-task fMRI data fusion, using a partially constrained canonical polyadic (CP) decomposition model. Unlike previous approaches, the proposed method directly recovers shared and dataset-specific components, leading to results that are directly interpretable. A strategy to select a highly reproducible solution to the decomposition is also proposed. We evaluate the proposed methodology on real fMRI data from three tasks and show that the proposed method finds meaningful components that clearly identify group differences between patients with schizophrenia and healthy controls.
  5. Gadekallu, Thippa Reddy (Ed.)
    As of March 30, 2021, over 5,193 COVID-19 clinical trials have been registered through ClinicalTrials.gov. Among them, 191 trials were terminated, suspended, or withdrawn (indicating the cessation of the study). On the other hand, 909 trials have been completed (indicating the completion of the study). In this study, we propose to study the underlying factors of COVID-19 trial completion vs. cessation, and design predictive models to accurately predict whether a COVID-19 trial may complete or cease in the future. We collect 4,441 COVID-19 trials from ClinicalTrials.gov to build a testbed, and design four types of features to characterize clinical trial administration, eligibility, study information, criteria, drug types, and study keywords, as well as embedding features commonly used in state-of-the-art machine learning. Our study shows that drug features and study keywords are the most informative features, but all four types of features are essential for accurate trial prediction. By using predictive models, our approach achieves more than 0.87 AUC (Area Under the Curve) score and 0.81 balanced accuracy in predicting COVID-19 clinical trial completion vs. cessation. Our research shows that computational methods can deliver effective features to understand the difference between completed vs. ceased COVID-19 trials. In addition, such models can also predict COVID-19 trial status with satisfactory accuracy, and help stakeholders better plan trials and minimize costs.