Title: Sampling and multilevel coarsening algorithms for fast matrix approximations
Summary

This paper addresses matrix approximation problems for matrices that are large, sparse, and/or representations of large graphs. To tackle these problems, we consider algorithms that are based primarily on coarsening techniques, possibly combined with random sampling. A multilevel coarsening technique is proposed, which utilizes a hypergraph associated with the data matrix and a graph coarsening strategy based on column matching. We consider a number of standard applications of this technique as well as a few new ones. Among standard applications, we first consider the problem of computing a partial singular value decomposition, for which a combination of sampling and coarsening yields significantly improved results relative to sampling alone. We also consider the column subset selection problem, a popular low-rank approximation method used in data-related applications, and show how multilevel coarsening can be adapted for this problem. Similarly, we consider the problem of graph sparsification and show how coarsening techniques can be employed to solve it. We also establish theoretical results that characterize the approximation error obtained and the quality of the dimension reduction achieved by a coarsening step, when a proper column matching strategy is employed. Numerical experiments illustrate the performance of the methods in a few applications.
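To make the core idea concrete, below is a minimal, hypothetical sketch of one coarsening level driven by column matching: columns that are nearly parallel (high cosine similarity) are paired and merged. The function name, the similarity threshold, and the 1/sqrt(2) merge scaling are illustrative choices only, not the paper's exact algorithm, which works with a hypergraph model of the matrix.

```python
# Hypothetical sketch (not the paper's code): one level of column
# coarsening by greedy column matching on cosine similarity.
import numpy as np

def coarsen_columns(A, threshold=0.9):
    """Greedily pair columns whose cosine similarity exceeds `threshold`
    and replace each matched pair by a single merged, rescaled column."""
    n = A.shape[1]
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0
    U = A / norms                       # columns normalized to unit length
    matched = np.zeros(n, dtype=bool)
    cols = []
    for i in range(n):
        if matched[i]:
            continue
        matched[i] = True
        sims = U.T @ U[:, i]            # cosine similarity to column i
        sims[matched] = -np.inf         # exclude already-matched columns
        j = int(np.argmax(sims))
        if sims[j] >= threshold:        # merge the matched pair
            matched[j] = True
            cols.append((A[:, i] + A[:, j]) / np.sqrt(2.0))
        else:                           # leave column i unmatched
            cols.append(A[:, i])
    return np.column_stack(cols)
```

Applying such a step repeatedly yields the multilevel hierarchy of successively smaller matrices on which the SVD, column subset selection, and sparsification applications operate.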

 
NSF-PAR ID:
10461118
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Numerical Linear Algebra with Applications
Volume:
26
Issue:
3
ISSN:
1070-5325
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    The multilevel heuristic is an effective strategy for speeding up graph analytics, and graph coarsening is an integral step of multilevel methods. We perform a comprehensive study of multilevel coarsening in this work. We primarily focus on the graphics processing unit (GPU) parallelization of the Heavy Edge Coarsening (HEC) method executed in an iterative setting. We present optimizations for the two phases of coarsening: a fine-to-coarse vertex mapping phase and a coarse graph construction phase. We also express several other coarsening algorithms using the Kokkos framework and discuss their parallelization. We demonstrate the efficacy of parallelized HEC on an NVIDIA Turing GPU and a 32-core AMD Ryzen processor using multilevel spectral graph partitioning as the primary case study.
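A sequential sketch of the two phases may help fix ideas. This is a plain Python/SciPy rendition of heavy-edge coarsening under simplified aggregation rules, not the Kokkos/GPU implementation the abstract describes; the function name and details are illustrative.

```python
# Hypothetical sequential sketch of Heavy Edge Coarsening (HEC): each
# unmatched vertex joins the aggregate of its heaviest-edge neighbor.
import numpy as np
import scipy.sparse as sp

def hec_coarsen(A):
    """One HEC pass on a symmetric weighted adjacency matrix in CSR form.
    Phase 1 builds the fine-to-coarse vertex map; phase 2 builds the
    coarse graph as P^T A P, where P aggregates matched vertices."""
    n = A.shape[0]
    label = -np.ones(n, dtype=np.int64)
    ncoarse = 0
    for v in range(n):                        # phase 1: vertex mapping
        if label[v] != -1:
            continue
        start, end = A.indptr[v], A.indptr[v + 1]
        nbrs, wts = A.indices[start:end], A.data[start:end]
        mask = nbrs != v                      # ignore self-loops
        nbrs, wts = nbrs[mask], wts[mask]
        if len(nbrs) == 0:                    # isolated vertex: singleton
            label[v] = ncoarse; ncoarse += 1
            continue
        u = nbrs[np.argmax(wts)]              # heaviest-edge neighbor
        if label[u] != -1:                    # join u's existing aggregate
            label[v] = label[u]
        else:                                 # start a new aggregate {v, u}
            label[v] = label[u] = ncoarse
            ncoarse += 1
    # phase 2: coarse graph construction via the aggregation matrix P
    P = sp.csr_matrix((np.ones(n), (np.arange(n), label)), shape=(n, ncoarse))
    Ac = (P.T @ A @ P).tocsr()
    Ac.setdiag(0); Ac.eliminate_zeros()       # drop coarse self-loops
    return Ac, label
```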
  2. Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm “randUTV” computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures. 
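For readers unfamiliar with the building block, a minimal RSVD sketch looks like the following. The names and the oversampling default are illustrative; this is the textbook randomized range finder, not the optimized randUTV codes evaluated in the article.

```python
# Minimal randomized SVD sketch (the building block underlying randUTV).
import numpy as np

def rsvd(A, k, p=10):
    """Rank-k approximate SVD via a randomized range finder;
    `p` is the oversampling parameter (illustrative default)."""
    m, n = A.shape
    G = np.random.default_rng(0).standard_normal((n, k + p))
    Q, _ = np.linalg.qr(A @ G)          # orthonormal basis for range(A G)
    B = Q.T @ A                         # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]
```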
    The focusing inversion of gravity and magnetic potential-field data using the randomized singular value decomposition (RSVD) method is considered. This approach facilitates tackling the computational challenge that arises when the inversion uses the standard, accurate approximation of the integral equation kernel. We have developed a comprehensive comparison of the developed methodology for the inversion of magnetic and gravity data. The results verify that there is an important difference between the application of the methodology for gravity and magnetic inversion problems. Specifically, RSVD depends on the generation of a rank-k approximation to the underlying model matrix, and the results demonstrate that k needs to be larger, for equivalent problem sizes, for the magnetic problem than for the gravity problem. Without a relatively large k, the dominant singular values of the magnetic model matrix are not well approximated. We determine that this is due to the spectral properties of the matrix. The comparison also shows how the use of the power iteration embedded within the randomized algorithm improves the quality of the resulting dominant subspace approximation, especially in magnetic inversion, yielding acceptable approximations for smaller choices of k. Further, we evaluate how the differences in spectral properties of the magnetic and gravity input matrices also affect the values that are automatically estimated for the regularization parameter. The algorithm is applied and verified for the inversion of magnetic data obtained over a portion of the Wuskwatim Lake region in Manitoba, Canada.
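The power-iteration idea the abstract highlights can be sketched as follows. The function name and defaults are hypothetical; q controls how many extra passes over A and A^T are used to sharpen the dominant subspace when the singular values decay slowly, as reported for the magnetic model matrix.

```python
# Illustrative randomized range finder with q power iterations,
# re-orthogonalizing between passes for numerical stability.
import numpy as np

def randomized_range(A, k, q=2, p=10):
    """Return an orthonormal basis approximating the dominant rank-(k+p)
    column space of A; q > 0 implements the subspace (power) iteration."""
    rng = np.random.default_rng(0)
    Y = A @ rng.standard_normal((A.shape[1], k + p))
    Q, _ = np.linalg.qr(Y)
    for _ in range(q):                  # power iterations: (A A^T)^q A G
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    return Q
```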
  4. The CP tensor decomposition is used in applications such as machine learning and signal processing to discover latent low-rank structure in multidimensional data. Computing a CP decomposition via an alternating least squares (ALS) method reduces the problem to several linear least squares problems. The standard way to solve these linear least squares subproblems is to use the normal equations, which inherit special tensor structure that can be exploited for computational efficiency. However, the normal equations are sensitive to numerical ill-conditioning, which can compromise the results of the decomposition. In this paper, we develop versions of the CP-ALS algorithm using the QR decomposition and the singular value decomposition, which are more numerically stable than the normal equations, to solve the linear least squares problems. Our algorithms utilize the tensor structure of the CP-ALS subproblems efficiently, have the same complexity as the standard CP-ALS algorithm when the input is dense and the rank is small, and are shown via examples to produce more stable results when ill-conditioning is present. Our MATLAB implementation achieves the same running time as the standard algorithm for small ranks, and we show that the new methods can obtain lower approximation error. 
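The numerical point about the normal equations can be illustrated outside the tensor setting. The toy problem below is hypothetical and not from the paper: forming A^T A squares the condition number of A, while a QR-based solve works with A directly.

```python
# Toy contrast: normal equations vs. QR on an ill-conditioned
# least squares problem with known solution x = ones(50).
import numpy as np

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((200, 50)))
V, _ = np.linalg.qr(rng.standard_normal((50, 50)))
s = np.logspace(0, -8, 50)              # singular values: cond(A) ~ 1e8
A = (U * s) @ V.T
b = A @ np.ones(50)

# Normal equations: cond(A^T A) ~ 1e16, near machine precision.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# QR-based solve operates on A itself and remains accurate.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

print(np.linalg.norm(x_ne - 1), np.linalg.norm(x_qr - 1))
```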
  5. Abstract

    The general method of graph coarsening or graph reduction has been a remarkably useful and ubiquitous tool in scientific computing, and it is now just starting to have a similar impact in machine learning. The goal of this paper is to take a broad look at coarsening techniques that have been successfully deployed in scientific computing and to see how similar principles are finding their way into more recent applications related to machine learning. In scientific computing, coarsening plays a central role in algebraic multigrid methods as well as in the related class of multilevel incomplete LU factorizations. In machine learning, graph coarsening goes under various names, e.g., graph downsampling or graph reduction. Its goal in most cases is to replace the original graph by one that has fewer nodes but whose structure and characteristics are similar to those of the original graph. As will be seen, a common strategy in these methods is to rely on spectral properties to define the coarse graph.
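As a toy illustration of the spectral strategy mentioned above (hypothetical code, not drawn from any of the surveyed methods): nodes that agree on the sign pattern of the first few nontrivial Laplacian eigenvectors are merged into a single coarse node.

```python
# Illustrative spectrally guided coarsening for a small connected graph.
import numpy as np
import scipy.sparse as sp

def spectral_coarsen(A, k=3):
    """Merge nodes sharing the sign pattern of the first k nontrivial
    eigenvectors of the graph Laplacian; A is a small symmetric
    adjacency matrix. All choices here are illustrative."""
    A = sp.csr_matrix(A)
    d = np.asarray(A.sum(axis=1)).ravel()
    L = np.diag(d) - A.toarray()                 # combinatorial Laplacian
    _, vecs = np.linalg.eigh(L)                  # ascending eigenvalues
    signs = vecs[:, 1:k + 1] > 0                 # skip the constant vector
    _, label = np.unique(signs, axis=0, return_inverse=True)
    label = label.ravel()
    n, nc = A.shape[0], label.max() + 1
    P = sp.csr_matrix((np.ones(n), (np.arange(n), label)), shape=(n, nc))
    Ac = (P.T @ A @ P).tocsr()
    Ac.setdiag(0); Ac.eliminate_zeros()          # drop coarse self-loops
    return Ac, label
```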

     