skip to main content

Title: Approximately low-rank recovery from noisy and local measurements by convex program

Low-rank matrix models have been universally useful for numerous applications, from classical system identification to more modern matrix completion in signal processing and statistics. The nuclear norm has been employed as a convex surrogate of the low-rankness since it induces a low-rank solution to inverse problems. While the nuclear norm for low rankness has an excellent analogy with the $\ell _1$ norm for sparsity through the singular value decomposition, other matrix norms also induce low-rankness. Particularly as one interprets a matrix as a linear operator between Banach spaces, various tensor product norms generalize the role of the nuclear norm. We provide a tensor-norm-constrained estimator for the recovery of approximately low-rank matrices from local measurements corrupted with noise. A tensor-norm regularizer is designed to adapt to the local structure. We derive statistical analysis of the estimator over matrix completion and decentralized sketching by applying Maurey’s empirical method to tensor products of Banach spaces. The estimator provides a near-optimal error bound in a minimax sense and admits a polynomial-time algorithm for these applications.

more » « less
Award ID(s):
1943201 1800872 1839177
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Information and Inference: A Journal of the IMA
Page Range / eLocation ID:
p. 1612-1654
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Low-rank matrix recovery is a fundamental problem in machine learning with numerous applications. In practice, the problem can be solved by convex optimization namely nuclear norm minimization, or by non-convex optimization as it is well-known that for low-rank matrix problems like matrix sensing and matrix completion, all local optima of the natural non-convex objectives are also globally optimal under certain ideal assumptions. In this paper, we study new approaches for matrix sensing in a semi-random model where an adversary can add any number of arbitrary sensing matrices. More precisely, the problem is to recover a low-rank matrix $X^\star$ from linear measurements $b_i = \langle A_i, X^\star \rangle$, where an unknown subset of the sensing matrices satisfies the Restricted Isometry Property (RIP) and the rest of the $A_i$'s are chosen adversarially. It is known that in the semi-random model, existing non-convex objectives can have bad local optima. To fix this, we present a descent-style algorithm that provably recovers the ground-truth matrix $X^\star$. For the closely-related problem of semi-random matrix completion, prior work [CG18] showed that all bad local optima can be eliminated by reweighting the input data. However, the analogous approach for matrix sensing requires reweighting a set of matrices to satisfy RIP, which is a condition that is NP-hard to check. Instead, we build on the framework proposed in [KLL$^+$23] for semi-random sparse linear regression, where the algorithm in each iteration reweights the input based on the current solution, and then takes a weighted gradient step that is guaranteed to work well locally. Our analysis crucially exploits the connection between sparsity in vector problems and low-rankness in matrix problems, which may have other applications in obtaining robust algorithms for sparse and low-rank problems. 
    more » « less

    Repeatedly recording seismic data over a period of months or years is one way to identify trapped oil and gas and to monitor CO2 injection in underground storage reservoirs and saline aquifers. This process of recording data over time and then differencing the images assumes the recording of the data over a particular subsurface region is repeatable. In other words, the hope is that one can recover changes in the Earth when the survey parameters are held fixed between data collection times. Unfortunately, perfect experimental repeatability almost never occurs. Acquisition inconsistencies such as changes in weather (currents, wind) for marine seismic data are inevitable, resulting in source and receiver location differences between surveys at the very least. Thus, data processing aimed at improving repeatability between baseline and monitor surveys is extremely useful. One such processing tool is regularization (or binning) that aligns multiple surveys with different source or receiver configurations onto a common grid. Data binned onto a regular grid can be stored in a high-dimensional data structure called a tensor with, for example, x and y receiver coordinates and time as indices of the tensor. Such a higher-order data structure describing a subsection of the Earth often exhibits redundancies which one can exploit to fill in gaps caused by sampling the surveys onto the common grid. In fact, since data gaps and noise increase the rank of the tensor, seeking to recover the original data by reducing the rank (low-rank tensor-based completion) successfully fills in gaps caused by binning. The tensor nuclear norm (TNN) is defined by the tensor singular value decomposition (tSVD) which generalizes the matrix SVD. In this work we complete missing time-lapse data caused by binning using the alternating direction method of multipliers (or ADMM) to minimize the TNN. For a synthetic experiment with three parabolic events in which the time-lapse difference involves an amplitude increase in one of these events between baseline and monitor data sets, the binning and reconstruction algorithm (TNN-ADMM) correctly recovers this time-lapse change. We also apply this workflow of binning and TNN-ADMM reconstruction to a real marine survey from offshore Western Australia in which the binning onto a regular grid results in significant data gaps. The data after reconstruction varies continuously without the large gaps caused by the binning process.

    more » « less
  3. Synchrophasor data suffer from quality issues like missing and bad data. Exploiting the low-rankness of the Hankel matrix of the synchrophasor data, this paper formulates the data recovery problem as a robust low-rank Hankel matrix completion problem and proposes a Bayesian data recovery method that estimates the posterior distribution of synchrophasor data from partial observations. In contrast to the deterministic approaches, our proposed Bayesian method provides an uncertainty index to evaluate the confidence of each estimation. To the best of our knowledge, this is the first method that provides confidence measure for synchrophasor data recovery. Numerical experiments on synthetic data and recorded synchrophasor data demonstrate that our method outperforms existing low-rank matrix completion methods. 
    more » « less
  4. null (Ed.)
    Matrix completion, the problem of completing missing entries in a data matrix with low-dimensional structure (such as rank), has seen many fruitful approaches and analyses. Tensor completion is the tensor analog that attempts to impute missing tensor entries from similar low-rank type assumptions. In this paper, we study the tensor completion problem when the sampling pattern is deterministic and possibly non-uniform. We first propose an efficient weighted Higher Order Singular Value Decomposition (HOSVD) algorithm for the recovery of the underlying low-rank tensor from noisy observations and then derive the error bounds under a properly weighted metric. Additionally, the efficiency and accuracy of our algorithm are both tested using synthetic and real datasets in numerical simulations. 
    more » « less
  5. null (Ed.)
    Abstract One of the classical approaches for estimating the frequencies and damping factors in a spectrally sparse signal is the MUltiple SIgnal Classification (MUSIC) algorithm, which exploits the low-rank structure of an autocorrelation matrix. Low-rank matrices have also received considerable attention recently in the context of optimization algorithms with partial observations, and nuclear norm minimization (NNM) has been widely used as a popular heuristic of rank minimization for low-rank matrix recovery problems. On the other hand, it has been shown that NNM can be viewed as a special case of atomic norm minimization (ANM), which has achieved great success in solving line spectrum estimation problems. However, as far as we know, the general ANM (not NNM) considered in many existing works can only handle frequency estimation in undamped sinusoids. In this work, we aim to fill this gap and deal with damped spectrally sparse signal recovery problems. In particular, inspired by the dual analysis used in ANM, we offer a novel optimization-based perspective on the classical MUSIC algorithm and propose an algorithm for spectral estimation that involves searching for the peaks of the dual polynomial corresponding to a certain NNM problem, and we show that this algorithm is in fact equivalent to MUSIC itself. Building on this connection, we also extend the classical MUSIC algorithm to the missing data case. We provide exact recovery guarantees for our proposed algorithms and quantify how the sample complexity depends on the true spectral parameters. In particular, we provide a parameter-specific recovery bound for low-rank matrix recovery of jointly sparse signals rather than use certain incoherence properties as in existing literature. Simulation results also indicate that the proposed algorithms significantly outperform some relevant existing methods (e.g., ANM) in frequency estimation of damped exponentials. 
    more » « less