We propose an algorithm to impute and forecast a time series by transforming the observed time series into a matrix, utilizing matrix estimation to recover missing values and denoise observed entries, and performing linear regression to make predictions. At the core of our analysis is a representation result, which states that for a large class of models, the transformed time series matrix is (approximately) lowrank. In effect, this generalizes the widely used Singular Spectrum Analysis (SSA) in the time series literature, and allows us to establish a rigorous link between time series analysis and matrix estimation. The key to establishing this link is constructing a Page matrix with nonoverlapping entries rather than a Hankel matrix as is commonly done in the literature (e.g., SSA). This particular matrix structure allows us to provide finite sample analysis for imputation and prediction, and prove the asymptotic consistency of our method. Another salient feature of our algorithm is that it is model agnostic with respect to both the underlying time dynamics and the noise distribution in the observations. The noise agnostic property of our approach allows us to recover the latent states when only given access to noisy and partial observations a la amore »
Recovering Structured Probability Matrices.
We consider the problem of accurately recovering a matrix B of size M by M, which represents a probability distribution over M^2 outcomes, given access to an observed matrix of "counts" generated by taking independent samples from the distribution B. How can structural properties of the underlying matrix B be leveraged to yield computationally efficient and information theoretically optimal reconstruction algorithms? When can accurate reconstruction be accomplished in the sparse data regime? This basic problem lies at the core of a number of questions that are currently being considered by different communities, including building recommendation systems and collaborative filtering in the sparse data regime, community detection in sparse random graphs, learning structured models such as topic models or hidden Markov models, and the efforts from the natural language processing community to compute "word embeddings". Many aspects of this problemboth in terms of learning and property testing/estimation and on both the algorithmic and information theoretic sidesremain open. Our results apply to the setting where B has a low rank structure. For this setting, we propose an efficient (and practically viable) algorithm that accurately recovers the underlying M by M matrix using O(M) samples} (where we assume the rank is a constant). more »
 Publication Date:
 NSFPAR ID:
 10061983
 Journal Name:
 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)
 Volume:
 94
 Page Range or eLocationID:
 46:146:14
 ISSN:
 18688969
 Sponsoring Org:
 National Science Foundation
More Like this


We propose an algorithm to impute and forecast a time series by transforming the observed time series into a matrix, utilizing matrix estimation to recover missing values and denoise observed entries, and performing linear regression to make predictions. At the core of our analysis is a representation result, which states that for a large class of models, the transformed time series matrix is (approximately) lowrank. In effect, this generalizes the widely used Singular Spectrum Analysis (SSA) in the time series literature, and allows us to establish a rigorous link between time series analysis and matrix estimation. The key to establishing this link is constructing a Page matrix with nonoverlapping entries rather than a Hankel matrix as is commonly done in the literature (e.g., SSA). This particular matrix structure allows us to provide finite sample analysis for imputation and prediction, and prove the asymptotic consistency of our method. Another salient feature of our algorithm is that it is model agnostic with respect to both the underlying time dynamics and the noise distribution in the observations. The noise agnostic property of our approach allows us to recover the latent states when only given access to noisy and partial observations a la amore »

We study the problem of testing identity against a given distribution with a focus on the high confidence regime. More precisely, given samples from an unknown distribution p over n elements, an explicitly given distribution q, and parameters 0< epsilon, delta < 1, we wish to distinguish, with probability at least 1delta, whether the distributions are identical versus epsilonfar in total variation distance. Most prior work focused on the case that delta = Omega(1), for which the sample complexity of identity testing is known to be Theta(sqrt{n}/epsilon^2). Given such an algorithm, one can achieve arbitrarily small values of delta via blackbox amplification, which multiplies the required number of samples by Theta(log(1/delta)). We show that blackbox amplification is suboptimal for any delta = o(1), and give a new identity tester that achieves the optimal sample complexity. Our new upper and lower bounds show that the optimal sample complexity of identity testing is Theta((1/epsilon^2) (sqrt{n log(1/delta)} + log(1/delta))) for any n, epsilon, and delta. For the special case of uniformity testing, where the given distribution is the uniform distribution U_n over the domain, our new tester is surprisingly simple: to test whether p = U_n versus d_{TV} (p, U_n) >= epsilon, wemore »

Can linear systems be solved faster than matrix multiplication? While there has been remarkable progress for the special cases of graph structured linear systems, in the general setting, the bit complexity of solving an $n \times n$ linear system $Ax=b$ is $\tilde{O}(n^\omega)$, where $\omega < 2.372864$ is the matrix multiplication exponent. Improving on this has been an open problem even for sparse linear systems with poly$(n)$ condition number. In this paper, we present an algorithm that solves linear systems in sparse matrices asymptotically faster than matrix multiplication for any $\omega > 2$. This speedup holds for any input matrix $A$ with $o(n^{\omega 1}/\log(\kappa(A)))$ nonzeros, where $\kappa(A)$ is the condition number of $A$. For poly$(n)$conditioned matrices with $\tilde{O}(n)$ nonzeros, and the current value of $\omega$, the bit complexity of our algorithm to solve to within any $1/\text{poly}(n)$ error is $O(n^{2.331645})$. Our algorithm can be viewed as an efficient, randomized implementation of the block Krylov method via recursive low displacement rank factorizations. It is inspired by the algorithm of [Eberly et al. ISSAC `06 `07] for inverting matrices over finite fields. In our analysis of numerical stability, we develop matrix anticoncentration techniques to bound the smallest eigenvalue and the smallest gap inmore »

We consider sparse matrix estimation where the goal is to estimate an nbyn matrix from noisy observations of a small subset of its entries. We analyze the estimation error of the popularly used collaborative filtering algorithm for the sparse regime. Specifically, we propose a novel iterative variant of the algorithm, adapted to handle the setting of sparse observations. We establish that as long as the number of entries observed at random scale logarithmically larger than linear in n, the estimation error with respect to the entrywise max norm decays to zero as n goes to infinity, assuming the underlying matrix of interest has constant rank r. Our result is robust to model misspecification in that if the underlying matrix is approximately rank r, then the estimation error decays to the approximation error with respect to the [Formula: see text]norm. In the process, we establish the algorithm’s ability to handle arbitrary bounded noise in the observations.