Many scientific applications rely on sparse direct solvers for their numerical robustness. However, performance optimization for these solvers remains a challenging task, especially on GPUs, because the workloads consist of many small dense matrices of differing sizes. Matrix decompositions on such irregular workloads are rarely addressed on GPUs. This paper addresses irregular workloads of matrix computations on GPUs and their application to accelerating sparse direct solvers. We design an interface for the basic matrix operations that supports problems of different sizes. The interface enables us to develop irrLU-GPU, an LU decomposition for matrices of different sizes. We demonstrate the impact of irrLU-GPU on sparse direct LU solvers using NVIDIA and AMD GPUs. Experimental results are shown for a sparse direct solver based on a multifrontal sparse LU decomposition, applied to linear systems arising from finite element discretization, on unstructured meshes, of a high-frequency indefinite Maxwell problem.
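For illustration, the minimal CPU-side sketch below shows what an interface for LU factorization over a batch of matrices of different sizes might look like. The function names are hypothetical and the code runs on the host with NumPy/SciPy only; it sketches the interface idea under those assumptions and is not the irrLU-GPU implementation from the paper.

```python
# Minimal CPU sketch of a batched-LU interface for matrices whose sizes all
# differ (hypothetical names; not the paper's GPU implementation).
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def irregular_batched_lu(matrices):
    """LU-factorize a list of square matrices of possibly different sizes."""
    return [lu_factor(A) for A in matrices]

def irregular_batched_solve(factors, rhs):
    """Solve A_i x_i = b_i for every factored matrix in the batch."""
    return [lu_solve(f, b) for f, b in zip(factors, rhs)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sizes = [3, 7, 5, 12]                      # irregular workload
    mats = [rng.standard_normal((n, n)) for n in sizes]
    rhs = [rng.standard_normal(n) for n in sizes]
    facs = irregular_batched_lu(mats)
    sols = irregular_batched_solve(facs, rhs)
    print(max(np.linalg.norm(A @ x - b) for A, x, b in zip(mats, sols, rhs)))
```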
Accurate bidiagonal decompositions of Cauchy–Vandermonde matrices of any rank
Abstract We present a new decomposition of a Cauchy–Vandermonde matrix as a product of bidiagonal matrices which, unlike existing bidiagonal decompositions, is valid for a matrix of any rank. The new decompositions are not subject to subtractive cancellation in floating point arithmetic and are thus computable to high relative accuracy. In turn, other accurate matrix computations with these matrices also become possible, such as eigenvalue computation, among others.
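For context, the sketch below constructs a Cauchy–Vandermonde matrix under one common convention (the first l columns are Cauchy columns 1/(x_i + d_j), the remaining columns are Vandermonde columns x_i^k) and computes its singular values with a conventional floating point SVD. It only illustrates the matrix class; it does not implement the accurate bidiagonal decomposition of the paper.

```python
# Cauchy-Vandermonde matrix under one common convention: l Cauchy columns
# 1/(x_i + d_j) followed by Vandermonde columns x_i^k. Illustration only;
# the paper's accurate bidiagonal decomposition is not implemented here.
import numpy as np

def cauchy_vandermonde(x, d, n):
    """n-column Cauchy-Vandermonde matrix on nodes x with poles d."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    l = len(d)
    cauchy = 1.0 / (x[:, None] + d[None, :])          # columns 1..l
    vander = x[:, None] ** np.arange(n - l)[None, :]  # columns l+1..n
    return np.hstack([cauchy, vander])

A = cauchy_vandermonde(x=[0.1, 0.2, 0.3, 0.4], d=[1.0, 2.0], n=4)
# A conventional SVD can lose all relative accuracy in the small singular
# values of such structured matrices, which is what high-relative-accuracy
# algorithms based on bidiagonal decompositions are designed to avoid.
print(np.linalg.svd(A, compute_uv=False))
```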
- PAR ID: 10531632
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name: Numerical Linear Algebra with Applications
- ISSN: 1070-5325
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
In this paper, we consider the decomposition of positive semidefinite matrices as a sum of rank-one matrices. We introduce and investigate the properties of various measures of optimality of such decompositions. For some classes of positive semidefinite matrices, we give these optimal decompositions explicitly. These classes include diagonally dominant matrices and certain of their generalizations, 2 × 2 matrices, and a class of 3 × 3 matrices.
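As a minimal illustration of writing a positive semidefinite matrix as a sum of rank-one matrices, the generic eigendecomposition-based construction is sketched below; it is not one of the optimal decompositions studied in that paper.

```python
# Write a positive semidefinite matrix as a sum of rank-one matrices via its
# eigendecomposition: A = sum_i lam_i v_i v_i^T with lam_i >= 0. This is the
# generic construction, not an optimality-measured decomposition.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T                                  # positive semidefinite by construction

lam, V = np.linalg.eigh(A)
rank_one_terms = [lam[i] * np.outer(V[:, i], V[:, i]) for i in range(len(lam))]
print(np.allclose(A, sum(rank_one_terms)))   # True: the terms sum back to A
```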
We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly outperforms the oblivious sampling method, which is the common choice in the recent literature. The adaptive subspace can be efficiently generated by a correlated random matrix ensemble whose statistics mimic the input data. We prove that the improvement in the relative error of the solution can be tightly characterized in terms of the spectrum of the data matrix, and we provide probabilistic upper bounds. We then illustrate the consequences of our theory with data matrices of different spectral decay. Extensive experimental results show that the proposed approach offers significant speedups in machine learning problems, including logistic regression, kernel classification with random convolution layers, and shallow neural networks with rectified linear units. Our analysis is based on convex analysis and Fenchel duality, and it establishes connections to sketching and randomized matrix decompositions.
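A minimal sketch of subspace descent for least squares is shown below; it uses oblivious Gaussian subspaces rather than the adaptive, data-correlated sampling proposed in that work, and the names are illustrative only.

```python
# Random-subspace descent (a generalization of coordinate descent) for least
# squares with oblivious Gaussian subspaces. The adaptive sampling and the
# analysis from the paper are not reproduced here.
import numpy as np

def subspace_descent_lstsq(A, b, k=5, iters=400, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        S = rng.standard_normal((n, k))          # random subspace basis
        AS = A @ S
        # Exactly minimize 0.5*||A(x + S z) - b||^2 over z.
        z, *_ = np.linalg.lstsq(AS, b - A @ x, rcond=None)
        x = x + S @ z
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 20))
b = rng.standard_normal(100)
x = subspace_descent_lstsq(A, b)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x - x_star))                # distance to the exact solution
```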
We recover a video of the motion taking place in a hidden scene by observing changes in indirect illumination in a nearby uncalibrated visible region. We solve this problem by factoring the observed video into a matrix product between the unknown hidden-scene video and an unknown light transport matrix. This task is extremely ill-posed, as any non-negative factorization will satisfy the data. Inspired by recent work on the Deep Image Prior, we parameterize the factor matrices using randomly initialized convolutional neural networks trained in a one-off manner, and we show that this results in decompositions that reflect the true motion in the hidden scene.
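As a generic baseline for the factorization view described above, the sketch below fits an observed matrix as a product of two nonnegative factors with standard multiplicative updates. It illustrates the ill-posedness noted in the abstract (many factorizations fit the data) and is not the convolutional, Deep Image Prior parameterization used in that work.

```python
# Generic nonnegative factorization Y ~= T @ Z via Lee-Seung multiplicative
# updates. Baseline illustration only; not the CNN-parameterized factors
# (Deep Image Prior) described in the abstract above.
import numpy as np

def nmf(Y, rank, iters=500, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    T = rng.random((m, rank))        # plays the role of the transport factor
    Z = rng.random((rank, n))        # plays the role of the hidden-video factor
    for _ in range(iters):
        T *= (Y @ Z.T) / (T @ Z @ Z.T + eps)
        Z *= (T.T @ Y) / (T.T @ T @ Z + eps)
    return T, Z

rng = np.random.default_rng(3)
Y = rng.random((60, 20)) @ rng.random((20, 80))   # nonnegative, low rank
T, Z = nmf(Y, rank=20)
print(np.linalg.norm(Y - T @ Z) / np.linalg.norm(Y))
```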
Abstract Interpolative decompositions (ID) involve “natural bases” of row and column subsets, or skeletons, of a given matrix that approximately span its row and column spaces. Although finding optimal skeleton subsets is combinatorially hard, classical greedy pivoting algorithms with rank-revealing properties, such as column-pivoted QR (CPQR), often provide good heuristics in practice. To select skeletons efficiently for large matrices, randomized sketching is commonly used as a preprocessing step to reduce the problem dimension while preserving essential information in the matrix. In addition to accelerating computation, randomization via sketching improves robustness against adversarial inputs while relaxing the rank-revealing assumption on the pivoting scheme. This enables faster skeleton selection based on LU with partial pivoting (LUPP) as a reliable alternative to rank-revealing pivoting methods such as CPQR. However, while coupling sketching with LUPP provides an efficient solution for an ID of a given rank, the lack of rank-revealing properties of LUPP makes it challenging to adaptively determine a suitable rank without prior knowledge of the matrix spectrum. As a remedy, we introduce an adaptive randomized LUPP algorithm that approximates the desired rank via fast estimation of the residual error. The resulting algorithm is not only adaptive but also parallelizable, attaining much higher practical speed due to the lower communication requirements of LUPP compared with CPQR. The method has been implemented for both CPUs and GPUs, and the resulting software has been made publicly available.
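A fixed-rank sketch of the non-adaptive ingredient, selecting skeleton columns by running LU with partial pivoting on a Gaussian sketch, is shown below; the adaptive rank estimation and the GPU implementation described above are not reproduced, and the helper names are illustrative.

```python
# Fixed-rank randomized column interpolative decomposition: sketch the rows
# of A with a Gaussian matrix, pick skeleton columns by LU with partial
# pivoting (LUPP) on the sketch, then fit the coefficient matrix.
import numpy as np

def lupp_pivots(M, k):
    """Return the first k row indices chosen by LU with partial pivoting."""
    M = np.array(M, dtype=float)
    perm = np.arange(M.shape[0])
    for j in range(k):
        p = j + np.argmax(np.abs(M[j:, j]))               # partial pivot
        M[[j, p]] = M[[p, j]]
        perm[[j, p]] = perm[[p, j]]
        M[j + 1:, j:] -= np.outer(M[j + 1:, j] / M[j, j], M[j, j:])
    return perm[:k]

def randomized_column_id(A, k, seed=0):
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((k, A.shape[0])) @ A          # k x n sketch of A
    J = lupp_pivots(Y.T, k)                               # skeleton column indices
    W, *_ = np.linalg.lstsq(A[:, J], A, rcond=None)       # A ~= A[:, J] @ W
    return J, W

rng = np.random.default_rng(4)
A = rng.standard_normal((200, 30)) @ rng.standard_normal((30, 120))  # rank 30
J, W = randomized_column_id(A, k=30)
print(np.linalg.norm(A - A[:, J] @ W) / np.linalg.norm(A))
```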