This content will become publicly available on January 22, 2026

Title: Adaptive Parallelizable Algorithms for Interpolative Decompositions via Partially Pivoted LU
Abstract: Interpolative decompositions (ID) involve “natural bases” of row and column subsets, or skeletons, of a given matrix that approximately span its row and column spaces. Although finding optimal skeleton subsets is combinatorially hard, classical greedy pivoting algorithms with rank‐revealing properties like column‐pivoted QR (CPQR) often provide good heuristics in practice. To select skeletons efficiently for large matrices, randomized sketching is commonly leveraged as a preprocessing step to reduce the problem dimension while preserving essential information in the matrix. In addition to accelerating computations, randomization via sketching improves robustness against adversarial inputs while relaxing the rank‐revealing assumption on the pivoting scheme. This enables faster skeleton selection based on LU with partial pivoting (LUPP) as a reliable alternative to rank‐revealing pivoting methods like CPQR. However, while coupling sketching with LUPP provides an efficient solution for ID with a given rank, the lack of rank‐revealing properties of LUPP makes it challenging to adaptively determine a suitable rank without prior knowledge of the matrix spectrum. As a remedy, in this work, we introduce an adaptive randomized LUPP algorithm that approximates the desired rank via fast estimation of the residual error. The resulting algorithm is not only adaptive but also parallelizable, attaining much higher practical speed due to the lower communication requirements of LUPP over CPQR. The method has been implemented for both CPUs and GPUs, and the resulting software has been made publicly available.
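The sketch-then-LUPP pipeline the abstract describes can be illustrated in a few lines of NumPy/SciPy. The toy below computes a column ID of a synthetic low-rank matrix: compress the rows with a Gaussian sketch, run LU with partial pivoting on the transposed sketch to select skeleton columns, and solve a least-squares problem for the interpolation matrix. All names, sizes, and the oversampling value are invented for illustration; this is not the authors' released code, and the paper's adaptive rank estimation is omitted.

```python
import numpy as np
from scipy.linalg import lu_factor, lstsq

rng = np.random.default_rng(0)

# Synthetic m x n test matrix with rapidly decaying spectrum (exact rank k).
m, n, k = 200, 150, 20
A = (rng.standard_normal((m, k)) * 0.5 ** np.arange(k)) @ rng.standard_normal((k, n))

# Sketch: compress the rows of A with a Gaussian test matrix.
ell = k + 10                        # sketch size = target rank + oversampling
Y = rng.standard_normal((ell, m)) @ A          # ell x n

# LUPP on Y^T: the first k row pivots of Y^T index skeleton columns of A.
_, piv = lu_factor(Y.T)             # partial pivoting only, no rank-revealing
perm = np.arange(n)
for i, p in enumerate(piv):         # LAPACK pivots encode sequential row swaps
    perm[i], perm[p] = perm[p], perm[i]
J = perm[:k]                        # selected skeleton column indices

# Column ID: A ~= A[:, J] @ T, with T obtained by least squares.
T, *_ = lstsq(A[:, J], A)
err = np.linalg.norm(A - A[:, J] @ T) / np.linalg.norm(A)
print(err)                          # small relative error for a rank-k matrix
```

The only dense factorization here runs on the small sketch, which is what makes the approach attractive for large matrices.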
Award ID(s):
2401889
PAR ID:
10573952
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Numerical Linear Algebra with Applications
Volume:
32
Issue:
1
ISSN:
1070-5325
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm “randUTV” computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures. 
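The randomized SVD that randUTV builds on fits in a dozen lines of NumPy. The following is a minimal textbook-style sketch, assuming a Gaussian test matrix, optional power iteration, and an unpivoted QR for the range basis; the function name, parameters, and test data are invented for illustration and do not come from the randUTV codebase.

```python
import numpy as np

rng = np.random.default_rng(1)

def rsvd(A, k, p=10, q=1):
    """Basic randomized SVD: rank-k approximation of A.

    p: oversampling; q: power iterations to sharpen a slowly decaying spectrum."""
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))
    Y = A @ Omega                      # random range sample
    for _ in range(q):                 # power iteration (optional accuracy boost)
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)             # orthonormal range basis, unpivoted QR
    B = Q.T @ A                        # small (k+p) x n matrix
    Uh, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Uh)[:, :k], s[:k], Vt[:k]

# Rank-15 test matrix with decaying singular values.
m, n, k = 300, 200, 15
A = (rng.standard_normal((m, k)) * 0.7 ** np.arange(k)) @ rng.standard_normal((k, n))
U, s, Vt = rsvd(A, k)
err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(err)
```

Note that every expensive step is a matrix-matrix multiply or an unpivoted QR, which is exactly the communication-friendly structure the abstract credits for randUTV's speed.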
  2. We consider algorithms with access to an unknown matrix M ∈ F^{n×d} via matrix-vector products: the algorithm chooses vectors v_1, …, v_q and observes Mv_1, …, Mv_q. Here the v_i can be randomized as well as chosen adaptively as a function of Mv_1, …, Mv_{i-1}. Motivated by applications of sketching in distributed computation, linear algebra, and streaming models, as well as connections to areas such as communication complexity and property testing, we initiate the study of the number q of queries needed to solve various fundamental problems. We study problems in three broad categories: linear algebra, statistics problems, and graph problems. For example, we consider the number of queries required to approximate the rank, trace, maximum eigenvalue, and norms of a matrix M; to compute the AND/OR/parity of each column or row of M; to decide whether there are identical columns or rows in M, or whether M is symmetric, diagonal, or unitary; or to decide whether a graph defined by M is connected or triangle-free. We also show separations for algorithms that are allowed to obtain matrix-vector products only by querying vectors on the right, versus algorithms that can query vectors on both the left and the right, as well as separations depending on the underlying field in which the matrix-vector product is computed. For graph problems, we show separations depending on the form of the matrix used to represent the graph (bipartite adjacency versus signed edge-vertex incidence matrix). Surprisingly, very few works discuss this fundamental model, and we believe a thorough investigation of problems in this model would be beneficial to a number of different application areas.
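A concrete instance of this query model is trace estimation: Hutchinson's estimator approximates tr(M) from q matrix-vector products with random sign vectors, never reading the entries of M directly. A minimal sketch (the function name and test setup are our own, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

def hutchinson_trace(matvec, n, q):
    """Estimate tr(M) from q matrix-vector products Mv with random sign vectors."""
    total = 0.0
    for _ in range(q):
        v = rng.choice([-1.0, 1.0], size=n)    # Rademacher probe vector
        total += v @ matvec(v)                 # E[v^T M v] = tr(M)
    return total / q

n = 500
M = rng.standard_normal((n, n))
M = M @ M.T                                    # SPD, so the estimator concentrates
est = hutchinson_trace(lambda v: M @ v, n, q=200)
rel = abs(est - np.trace(M)) / np.trace(M)
print(rel)
```

The algorithm only touches M through the `matvec` callback, which is exactly the access model the abstract studies; the number of probes q trades accuracy for queries.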
  3. We introduce a Generalized LU Factorization (GLU) for low-rank matrix approximation. We relate this to past approaches and extensively analyze its approximation properties. The established deterministic guarantees are combined with sketching ensembles satisfying Johnson–Lindenstrauss properties to present complete bounds. Particularly good performance is shown for the subsampled randomized Hadamard transform (SRHT) ensemble. Moreover, the factorization is shown to unify and generalize many past algorithms, sometimes providing strictly better approximations. It also helps to explain the effect of sketching on the growth factor during Gaussian elimination.
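In the same spirit, a two-sided sketched LU selection can be demonstrated in a few lines: partial-pivoting pivots of a row sketch and a column sketch pick skeleton index sets, and the skeleton cross matrix yields a CUR-style approximation. This toy uses a Gaussian ensemble rather than the SRHT analyzed in the abstract, and is a sketch of the general idea rather than the paper's GLU algorithm; all sizes are invented.

```python
import numpy as np
from scipy.linalg import lu_factor

rng = np.random.default_rng(3)

def lupp_pivots(B, k):
    """Indices of the first k rows chosen by LU with partial pivoting on B."""
    _, piv = lu_factor(B)
    perm = np.arange(B.shape[0])
    for i, p in enumerate(piv[:k]):      # LAPACK pivots encode sequential swaps
        perm[i], perm[p] = perm[p], perm[i]
    return perm[:k]

# Synthetic rank-k test matrix with decaying spectrum.
m, n, k = 250, 180, 12
A = (rng.standard_normal((m, k)) * 0.6 ** np.arange(k)) @ rng.standard_normal((k, n))

ell = k + 8                                                   # oversampled sketch size
I = lupp_pivots(A @ rng.standard_normal((n, ell)), k)         # row skeletons
J = lupp_pivots((rng.standard_normal((ell, m)) @ A).T, k)     # column skeletons

# CUR-style approximation through the skeleton cross matrix A[I, J].
mid = np.linalg.pinv(A[np.ix_(I, J)])
err = np.linalg.norm(A - A[:, J] @ mid @ A[I, :]) / np.linalg.norm(A)
print(err)
```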
  4. Merge trees are a type of topological descriptor that records the connectivity among the sublevel sets of scalar fields. They are among the most widely used topological tools in visualization. In this paper, we are interested in sketching a set of merge trees using techniques from matrix sketching. That is, given a large set T of merge trees, we would like to find a much smaller set of basis trees S such that each tree in T can be approximately reconstructed from a linear combination of merge trees in S. A set of high-dimensional vectors can be approximated via matrix sketching techniques such as principal component analysis and column subset selection. However, until now, there has not been any work on sketching a set of merge trees. We develop a framework for sketching a set of merge trees that combines matrix sketching with tools from optimal transport. In particular, we vectorize a set of merge trees into high-dimensional vectors while preserving their structures and structural relations. We demonstrate the applications of our framework in sketching merge trees that arise from time-varying scientific simulations. Specifically, our framework obtains a set of basis trees as representatives that capture the “modes” of physical phenomena for downstream analysis and visualization.
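Once the merge trees are vectorized, the matrix-sketching step the abstract mentions reduces to standard column subset selection. The toy below stands in for vectorized trees with synthetic near-low-rank vectors and uses column-pivoted QR to pick the basis columns; all sizes and names are illustrative, and the optimal-transport vectorization itself is not reproduced here.

```python
import numpy as np
from scipy.linalg import qr, lstsq

rng = np.random.default_rng(4)

# Stand-in for vectorized merge trees: N high-dimensional vectors that all
# lie near a k-dimensional subspace (the "modes" of the simulation).
d, N, k = 400, 60, 5
T = rng.standard_normal((d, k)) @ rng.standard_normal((k, N))
T += 1e-3 * rng.standard_normal((d, N))        # small perturbation

# Column subset selection: pivoted QR picks k representative columns.
_, _, piv = qr(T, pivoting=True, mode='economic')
S = T[:, piv[:k]]                               # the "basis trees"

# Reconstruct every column as a linear combination of the basis columns.
W, *_ = lstsq(S, T)
err = np.linalg.norm(T - S @ W) / np.linalg.norm(T)
print(err)
```

Unlike PCA, the basis here consists of actual columns of T, so each representative corresponds to a real tree from the input set, which matches the abstract's goal of interpretable representatives.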
  5. We study the joint low-rank factorization of the matrices X=[A B]G and Y=[A C]H, in which the columns of the shared factor matrix A correspond to vectorized rank-one matrices, the unshared factors B and C have full column rank, and the matrices G and H have full row rank. The objective is to find the shared factor A, given only X and Y. We first explain that if the matrix [A B C] has full column rank, then a basis for the column space of the shared factor matrix A can be obtained from the null space of the matrix [X Y]. This in turn implies that the problem of finding the shared factor matrix A boils down to a basic Canonical Polyadic Decomposition (CPD) problem that in many cases can directly be solved by means of an eigenvalue decomposition. Next, we explain that by taking the rank-one constraint of the columns of the shared factor matrix A into account when computing the null space of the matrix [X Y], more relaxed identifiability conditions can be obtained that do not require that [A B C] has full column rank. The benefit of the unconstrained null space approach is that it leads to simple algorithms while the benefit of the rank-one constrained null space approach is that it leads to relaxed identifiability conditions. Finally, a joint unbalanced orthogonal Procrustes and CPD fitting approach for computing the shared factor matrix A from noisy observation matrices X and Y will briefly be discussed. 
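The unconstrained null-space observation in the abstract is easy to verify numerically: for [z1; z2] in the null space of [X Y], the vector X z1 = -Y z2 lies in col(X) ∩ col(Y), which equals col(A) when [A B C] has full column rank. A small random instance (all dimensions invented for illustration):

```python
import numpy as np
from scipy.linalg import null_space, subspace_angles

rng = np.random.default_rng(5)

# X = [A B] G and Y = [A C] H share only the factor A (generic random data).
n, rA, rB, rC = 40, 3, 4, 5
A = rng.standard_normal((n, rA))
B = rng.standard_normal((n, rB))
C = rng.standard_normal((n, rC))
G = rng.standard_normal((rA + rB, 20))        # full row rank
H = rng.standard_normal((rA + rC, 25))        # full row rank
X = np.hstack([A, B]) @ G
Y = np.hstack([A, C]) @ H

# For z = [z1; z2] in null([X Y]), X z1 = -Y z2 lies in col(X) ∩ col(Y) = col(A).
Z = null_space(np.hstack([X, Y]))
cand = X @ Z[:20]                              # candidate vectors in col(A)

# Extract an orthonormal rank-rA basis from the candidate columns.
U, s, _ = np.linalg.svd(cand, full_matrices=False)
basis = U[:, :rA]
theta = subspace_angles(basis, A)              # principal angles vs true col(A)
print(theta.max())
```

The largest principal angle is at machine-precision level, confirming that the recovered subspace coincides with col(A); the rank-one structure of A's columns, which the abstract exploits for relaxed identifiability, is not used in this unconstrained version.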