skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Friday, December 13 until 2:00 AM ET on Saturday, December 14 due to maintenance. We apologize for the inconvenience.


Title: Randomized algorithms for generalized singular value decomposition with application to sensitivity analysis
Abstract

The generalized singular value decomposition (GSVD) is a valuable tool that has many applications in computational science. However, computing the GSVD for large‐scale problems is challenging. Motivated by applications in hyper‐differential sensitivity analysis (HDSA), we propose new randomized algorithms for computing the GSVD which use randomized subspace iteration and weighted QR factorization. Detailed error analysis is given which provides insight into the accuracy of the algorithms and the choice of the algorithmic parameters. We demonstrate the performance of our algorithms on test matrices and a large‐scale model problem where HDSA is used to study subsurface flow.

 
more » « less
Award ID(s):
1821149 1845406
PAR ID:
10450909
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Numerical Linear Algebra with Applications
Volume:
28
Issue:
4
ISSN:
1070-5325
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Gaussian process (GP) is a staple in the toolkit of a spatial statistician. Well‐documented computing roadblocks in the analysis of large geospatial datasets using GPs have now largely been mitigated via several recent statistical innovations. Nearest neighbor Gaussian process (NNGP) has emerged as one of the leading candidates for such massive‐scale geospatial analysis owing to their empirical success. This article reviews the connection of NNGP to sparse Cholesky factors of the spatial precision (inverse‐covariance) matrix. Focus of the review is on these sparse Cholesky matrices which are versatile and have recently found many diverse applications beyond the primary usage of NNGP for fast parameter estimation and prediction in the spatial (generalized) linear models. In particular, we discuss applications of sparse NNGP Cholesky matrices to address multifaceted computational issues in spatial bootstrapping, simulation of large‐scale realizations of Gaussian random fields, and extensions to nonparametric mean function estimation of a GP using random forests. We also review a sparse‐Cholesky‐based model for areal (geographically aggregated) data that addresses long‐established interpretability issues of existing areal models. Finally, we highlight some yet‐to‐be‐addressed issues of such sparse Cholesky approximations that warrant further research.

    This article is categorized under:

    Algorithms and Computational Methods > Algorithms

    Algorithms and Computational Methods > Numerical Methods

     
    more » « less
  2. The Tucker tensor decomposition is a natural extension of the singular value decomposition (SVD) to multiway data. We propose to accelerate Tucker tensor decomposition algorithms by using randomization and parallelization. We present two algorithms that scale to large data and many processors, significantly reduce both computation and communication cost compared to previous deterministic and randomized approaches, and obtain nearly the same approximation errors. The key idea in our algorithms is to perform randomized sketches with Kronecker-structured random matrices, which reduces computation compared to unstructured matrices and can be implemented using a fundamental tensor computational kernel. We provide probabilistic error analysis of our algorithms and implement a new parallel algorithm for the structured randomized sketch. Our experimental results demonstrate that our combination of randomization and parallelization achieves accurate Tucker decompositions much faster than alternative approaches. We observe up to a 16X speedup over the fastest deterministic parallel implementation on 3D simulation data. 
    more » « less
  3. Mikołaj Boja´nczyk, Emanuela Merelli (Ed.)
    We initiate a systematic study of algorithms that are both differentially-private and run in sublinear time for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially-private $(1+\rho)$-approximation algorithm for the problem of computing the average degree of a graph, for every $\rho>0$. The running time of the algorithm is roughly the same (for sparse graphs) as its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially-private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of \emph{coupled global sensitivity} of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad-hoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms. 
    more » « less
  4. A skyline query searches the data points that are not dominated by others in the dataset. It is widely adopted for many applications which require multi-criteria decision making. However, skyline query processing is considerably time-consuming for a high-dimensional large scale dataset. Parallel computing techniques are therefore needed to address this challenge, among which MapReduce is one of the most popular frameworks to process big data. A great number of efficient MapReduce skyline algorithms have been proposed in the literature and most of their designs focus on partitioning and pruning the given dataset. However, there are still opportunities for further parallelism. In this study, we propose two parallel skyline processing algorithms using a novel LShape partitioning strategy and an effective Propagation Filtering method. These two algorithms are 2Phase LShape and 1Phase LShape, used for multiple reducers and single reducer, respectively. By extensive experiments, we verify that our algorithms outperformed the state-of-the-art approaches, especially for high-dimensional large scale datasets. 
    more » « less
  5. Several variants of the subgraph isomorphism problem, e.g., finding, counting and estimating frequencies of subgraphs in networks arise in a number of real world applications, such as web analysis, disease diffusion prediction and social network analysis. These problems are computationally challenging in having to scale to very large networks with millions of vertices. In this paper, we present SAHAD, a MapReduce algorithm for detecting and counting trees of bounded size using the elegant color coding technique developed by N. Alon et al. SAHAD is a randomized algorithm, and we show rigorous bounds on the approximation quality and the performance of it. SAHAD scales to very large networks comprising of 107-108 edges and tree-like (acyclic) templates with up to 12 vertices. Further, we extend our results by implementing SAHAD in the Harp framework, which is more of a high performance computing environment. The new implementation gives 100x improvement in performance over the standard Hadoop implementation and achieves better performance than state-of-the-art MPI solutions on larger graphs. 
    more » « less