The exponential scaling of complete active space and full configuration interaction (CI) calculations limits the ability of quantum chemists to simulate the electronic structures of strongly correlated systems. Herein, we present corner hierarchically approximated CI (CHACI), an approach to wave function compression based on corner hierarchical matrices (CH-matrices), a new variant of hierarchical matrices built on block-wise low-rank decomposition. By application to dodecacene, a strongly correlated molecule, we demonstrate that CH-matrix compression provides superior compression compared to a truncated global singular value decomposition. The compression ratio is shown to improve with increasing active space size. By comparing several alternative schemes, we demonstrate that superior compression is achieved by (a) using a blocking approach that emphasizes the upper-left corner of the CI vector, (b) sorting the CI vector prior to compression, and (c) optimizing the rank of each block to maximize information density.
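As a rough numerical illustration of the contrast drawn above, the sketch below compares a truncated global SVD against independent block-wise truncated SVDs at a matched storage budget. The test matrix, block size, and ranks are illustrative assumptions only; this is not the CHACI algorithm or its data, just the bookkeeping of block-wise versus global low-rank compression.

```python
import numpy as np

def truncated_svd(M, rank):
    """Best rank-`rank` approximation of M via the SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]

def blockwise_lowrank(M, block, rank):
    """Approximate each `block` x `block` tile of M with its own truncated SVD."""
    out = np.zeros_like(M)
    n = M.shape[0]
    for i in range(0, n, block):
        for j in range(0, n, block):
            tile = M[i:i + block, j:j + block]
            out[i:i + block, j:j + block] = truncated_svd(tile, min(rank, *tile.shape))
    return out

rng = np.random.default_rng(0)
n = 256
# Toy stand-in for a sorted CI vector reshaped into a matrix: entry
# magnitudes decay away from the upper-left corner.
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
M = np.exp(-(i + j) / 40.0) * rng.standard_normal((n, n))

# Matched storage: global rank 16 stores 16*(256+256) numbers; 16 tiles of
# size 64 at rank 4 store 16*4*(64+64) numbers -- the same budget.
err_global = np.linalg.norm(M - truncated_svd(M, 16)) / np.linalg.norm(M)
err_block = np.linalg.norm(M - blockwise_lowrank(M, 64, 4)) / np.linalg.norm(M)
print(f"global truncated SVD, rank 16 : rel. error {err_global:.3e}")
print(f"block-wise SVD, 64x64 @ rank 4: rel. error {err_block:.3e}")
```

Which scheme wins depends on how the matrix's information is distributed; adaptive per-block ranks, as in point (c) above, are what let a block-wise scheme spend its budget where the entries are.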
Joint Low-Rank Factorizations with Shared and Unshared Components: Identifiability and Algorithms
We study the joint low-rank factorization of the matrices $X = [A\ B]G$ and $Y = [A\ C]H$, in which the columns of the shared factor matrix $A$ correspond to vectorized rank-one matrices, the unshared factors $B$ and $C$ have full column rank, and the matrices $G$ and $H$ have full row rank. The objective is to find the shared factor $A$, given only $X$ and $Y$. We first explain that if the matrix $[A\ B\ C]$ has full column rank, then a basis for the column space of the shared factor matrix $A$ can be obtained from the null space of the matrix $[X\ Y]$. This in turn implies that the problem of finding the shared factor matrix $A$ boils down to a basic Canonical Polyadic Decomposition (CPD) problem that in many cases can be solved directly by means of an eigenvalue decomposition. Next, we explain that by taking the rank-one constraint on the columns of the shared factor matrix $A$ into account when computing the null space of the matrix $[X\ Y]$, more relaxed identifiability conditions can be obtained that do not require $[A\ B\ C]$ to have full column rank. The benefit of the unconstrained null space approach is that it leads to simple algorithms, while the benefit of the rank-one constrained null space approach is that it leads to relaxed identifiability conditions. Finally, we briefly discuss a joint unbalanced orthogonal Procrustes and CPD fitting approach for computing the shared factor matrix $A$ from noisy observation matrices $X$ and $Y$.
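A minimal numerical sketch of the unconstrained null-space step is given below, under the full-column-rank assumption on $[A\ B\ C]$: any null vector $z = [z_1; z_2]$ of $[X\ Y]$ satisfies $Xz_1 = -Yz_2$, a vector in the intersection of the column spaces of $X$ and $Y$, which in that case equals the column space of $A$. All dimensions are illustrative, and the subsequent CPD/eigenvalue step that extracts $A$ itself from the recovered basis is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
rA, rB, rC, k = 2, 3, 3, 20             # illustrative sizes

# Shared factor A: each column is a vectorized 6x6 rank-one matrix (m = 36).
A = np.column_stack([np.outer(rng.standard_normal(6),
                              rng.standard_normal(6)).ravel()
                     for _ in range(rA)])
B = rng.standard_normal((36, rB))       # unshared factors, full column rank
C = rng.standard_normal((36, rC))
G = rng.standard_normal((rA + rB, k))   # full row rank (almost surely)
H = rng.standard_normal((rA + rC, k))

X = np.hstack([A, B]) @ G               # X = [A B]G
Y = np.hstack([A, C]) @ H               # Y = [A C]H

# Null space of [X Y]: if [X Y] z = 0 with z = [z1; z2], then
# X z1 = -Y z2 lies in col(X) ∩ col(Y) = col(A).
XY = np.hstack([X, Y])
_, s, Vt = np.linalg.svd(XY)
r = int((s > 1e-10 * s[0]).sum())       # numerical rank of [X Y]
Z1 = Vt[r:, :].T[:k, :]                 # top blocks z1 of a null-space basis

W = X @ Z1                              # columns lie in col(A)
Uw, sw, _ = np.linalg.svd(W)
rw = int((sw > 1e-10 * sw[0]).sum())    # recovered subspace dimension (= rA)

# Principal-angle check: cosines ~ 1 mean the subspaces coincide.
Qa, _ = np.linalg.qr(A)
cosines = np.linalg.svd(Qa.T @ Uw[:, :rw], compute_uv=False)
print("recovered dimension:", rw)
print("principal-angle cosines vs col(A):", np.round(cosines, 8))
```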
- Award ID(s): 1704074
- PAR ID: 10169182
- Date Published:
- Journal Name: 27th European Signal Processing Conference (EUSIPCO), A Coruña, Spain, 2019
- Page Range / eLocation ID: 1 to 5
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We consider the rank of a class of sparse Boolean matrices of size $n \times n$. In particular, we show that the probability that such a matrix has full rank, and is thus invertible, is a positive constant with value about $0.2574$ for large $n$. The matrices arise as the vertex-edge incidence matrices of 1-out 3-uniform hypergraphs. The result that the null space is bounded in expectation can be contrasted with results for the usual models of sparse Boolean matrices, based on the vertex-edge incidence matrix of random $k$-uniform hypergraphs; for this latter model, the expected co-rank is linear in the number of vertices $n$ [ACO, CFP]. For fields of higher order, the co-rank is typically Poisson distributed.
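The full-rank probability can be estimated by direct simulation over GF(2), as in the sketch below. The reading of the 1-out model is an assumption: edge $j$ consists of vertex $j$ together with two distinct other vertices chosen uniformly at random; the rank routine is plain Gaussian elimination with XOR row updates.

```python
import numpy as np

def gf2_rank(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination (XOR updates)."""
    M = M.copy()
    rank = 0
    for col in range(M.shape[1]):
        pivots = np.nonzero(M[rank:, col])[0]
        if pivots.size == 0:
            continue
        p = rank + pivots[0]
        M[[rank, p]] = M[[p, rank]]     # swap pivot row into place
        rows = np.nonzero(M[:, col])[0]
        rows = rows[rows != rank]
        M[rows] ^= M[rank]              # clear this column in all other rows
        rank += 1
        if rank == M.shape[0]:
            break
    return rank

def one_out_incidence(n, rng):
    """Vertex-edge incidence matrix of a 1-out 3-uniform hypergraph.
    Assumed model: edge j = {j} plus two distinct other random vertices."""
    M = np.zeros((n, n), dtype=np.uint8)
    for j in range(n):
        others = rng.choice(np.delete(np.arange(n), j), size=2, replace=False)
        M[j, j] = 1
        M[others, j] = 1
    return M

rng = np.random.default_rng(2)
n, trials = 200, 200
hits = sum(gf2_rank(one_out_incidence(n, rng)) == n for _ in range(trials))
print(f"estimated P(full rank) = {hits / trials:.3f} (paper: ~0.2574 as n grows)")
```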
QR factorization is a key tool in mathematics, computer science, operations research, and engineering. This paper presents the roundoff-error-free (REF) QR factorization framework, comprising integer-preserving versions of the standard and the thin QR factorizations and associated algorithms to compute them. Specifically, the standard REF QR factorization factors a given matrix $A \in \mathbb{Z}^{m \times n}$ as $A = QDR$, where $Q \in \mathbb{Z}^{m \times m}$ has pairwise orthogonal columns, $D$ is a diagonal matrix, and $R \in \mathbb{Z}^{m \times n}$ is an upper trapezoidal matrix; notably, the entries of $Q$ and $R$ are integral, while the entries of $D$ are reciprocals of integers. In the thin REF QR factorization, $Q \in \mathbb{Z}^{m \times n}$ likewise has pairwise orthogonal columns, and $R \in \mathbb{Z}^{n \times n}$ is upper triangular. In contrast to traditional (i.e., floating-point) QR factorizations, every operation used to compute these factors is integral; thus, REF QR is guaranteed to be an exact orthogonal decomposition. Importantly, the bit-length of every entry in the REF QR factorizations (and within the algorithms to compute them) is bounded polynomially. Notable applications of our REF QR factorizations include finding exact least-squares or exact basic solutions (i.e., a rational $n$-dimensional vector $x$) to any given full-column-rank or rank-deficient linear system $Ax = b$, respectively. In addition, our exact factorizations can be used as a subroutine within exact and/or high-precision quadratic programming. Altogether, REF QR provides a framework to obtain exact orthogonal factorizations of any rational matrix (as any rational/decimal matrix can easily be transformed into an integral matrix).
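The shape of the $A = QDR$ factorization can be reproduced with exact rational arithmetic, as in the sketch below. This is not the authors' integer-preserving REF algorithm: it runs Gram-Schmidt over fractions.Fraction, so $Q$ comes out rational rather than integral and no bit-length bound is enforced. It only demonstrates the exact, roundoff-free $QDR$ structure, with $Q$ pairwise orthogonal, $R$ upper triangular, and $D$ holding reciprocals of the squared column norms.

```python
from fractions import Fraction

def rational_qdr(A):
    """Exact A = Q D R via Gram-Schmidt over the rationals.
    Q has pairwise orthogonal (rational, not necessarily integral) columns,
    D is diagonal with entries 1/<u_k, u_k>, R is upper triangular with
    R[j][k] = <u_j, a_k>. Assumes A has full column rank. NOT the REF
    algorithm, which additionally scales everything to integer entries."""
    cols = [[Fraction(x) for x in col] for col in zip(*A)]  # columns of A
    n = len(cols)
    Q = []                                    # Q[k] is column u_k
    R = [[Fraction(0)] * n for _ in range(n)]
    for k, a in enumerate(cols):
        u = list(a)
        for j in range(k):
            R[j][k] = sum(qi * ai for qi, ai in zip(Q[j], a))   # <u_j, a_k>
            coef = R[j][k] / sum(qi * qi for qi in Q[j])
            u = [ui - coef * qi for ui, qi in zip(u, Q[j])]     # project out u_j
        R[k][k] = sum(ui * ui for ui in u)                      # <u_k, u_k>
        Q.append(u)
    D = [Fraction(1) / R[k][k] for k in range(n)]
    return Q, D, R

A = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]   # small integral test matrix
Q, D, R = rational_qdr(A)

# Exact verification: A == Q D R entrywise, with no roundoff whatsoever.
for i in range(3):
    for j in range(3):
        assert sum(Q[k][i] * D[k] * R[k][j] for k in range(3)) == A[i][j]
print("A = QDR verified exactly over the rationals")
```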
We address the problem of high-rank matrix completion with side information. In contrast to existing work on side information, which assumes that the data matrix is low-rank, we consider the more general scenario where the columns of the data matrix are drawn from a union of low-dimensional subspaces, which can lead to a high-rank matrix. Our goal is to complete the matrix while taking advantage of the side information. To do so, we use the self-expressive property of the data, searching for a sparse representation of each column of the matrix as a combination of a few other columns. More specifically, we propose a factorization of the data matrix as the product of side information matrices with an unknown interaction matrix, under which each column of the data matrix can be reconstructed using a sparse combination of other columns. As the proposed optimization, which searches for missing entries and sparse coefficients, is non-convex and NP-hard, we propose a lifting framework in which we couple sparse coefficients and missing values and define an equivalent optimization that is amenable to convex relaxation. We also propose a fast implementation of our convex framework using a Linearized Alternating Direction Method. Through extensive experiments on both synthetic and real data, and, in particular, by studying the problem of multi-label learning, we demonstrate that our method outperforms existing techniques in both low-rank and high-rank data regimes.
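The self-expressive step can be illustrated in isolation, as below: columns drawn from a union of low-dimensional subspaces, with one column written as a sparse combination of the others via an $\ell_1$-penalized least-squares problem solved by ISTA. This sketch assumes fully observed data and omits the paper's side information, lifting, and Linearized Alternating Direction Method; all sizes and parameters are illustrative.

```python
import numpy as np

def ista_lasso(D, x, lam=0.05, iters=500):
    """Solve min_c 0.5*||x - D c||^2 + lam*||c||_1 by ISTA
    (proximal gradient descent with soft-thresholding)."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        g = D.T @ (D @ c - x)               # gradient of the smooth part
        c = c - g / L
        c = np.sign(c) * np.maximum(np.abs(c) - lam / L, 0.0)  # prox step
    return c

rng = np.random.default_rng(3)
# Data from a union of two 2-dimensional subspaces in R^20, fully observed
# (the completion of missing entries is outside this sketch).
m, per = 20, 15
X = np.hstack([rng.standard_normal((m, 2)) @ rng.standard_normal((2, per))
               for _ in range(2)])
X /= np.linalg.norm(X, axis=0)              # unit-norm columns

# Express column 0 using the remaining columns; the nonzero coefficients
# should concentrate on columns from the same subspace (indices 0..13
# of the dictionary, i.e., the other members of the first subspace).
c = ista_lasso(np.delete(X, 0, axis=1), X[:, 0])
support = np.nonzero(np.abs(c) > 1e-3)[0]
print("support of the sparse representation:", support)
```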