
ISLET: Fast and Optimal Low-Rank Tensor Regression via Importance Sketching
In this paper, we develop a novel procedure for low-rank tensor regression, namely Importance Sketching Low-rank Estimation for Tensors (ISLET). The central idea behind ISLET is importance sketching, i.e., carefully designed sketches based on both the responses and the low-dimensional structure of the parameter of interest. We show that the proposed method is sharply minimax optimal in terms of the mean-squared error under low-rank Tucker assumptions and under the randomized Gaussian ensemble design. In addition, if a tensor is low-rank with group sparsity, our procedure also achieves minimax optimality. Further, we show through numerical study that ISLET achieves mean-squared error comparable to or better than that of existing state-of-the-art methods while having substantial storage and run-time advantages, including capabilities for parallel and distributed computing. In particular, our procedure performs reliable estimation with tensors of dimension $p = O(10^8)$ and is one to two orders of magnitude faster than baseline methods.
NSF-PAR ID:
10164437
Journal Name:
SIAM journal on mathematics of data science
Volume:
2
Issue:
2
Page Range or eLocation-ID:
444-479
ISSN:
2577-0187
Sponsoring Org:
National Science Foundation
##### More Like this
1. Real-world spatio-temporal data is often incomplete or inaccurate due to various data loading delays. For example, a location-disease-time tensor of case counts can have multiple delayed updates of recent temporal slices for some locations or diseases. Recovering such missing or noisy (under-reported) elements of the input tensor can be viewed as a generalized tensor completion problem. Existing tensor completion methods usually assume that i) missing elements are randomly distributed and ii) noise for each tensor element is i.i.d. zero-mean. Both assumptions can be violated for spatio-temporal tensor data. We often observe multiple versions of the input tensor with different under-reporting noise levels. The amount of noise can be time- or location-dependent as more updates are progressively introduced to the tensor. We model such dynamic data as a multi-version tensor with an extra tensor mode capturing the data updates. We propose a low-rank tensor model to predict the updates over time. We demonstrate that our method can accurately predict the ground-truth values of many real-world tensors. We obtain up to 27.2% lower root mean-squared-error compared to the best baseline method. Finally, we extend our method to track the tensor data over time, leading to significant computational savings.

2. Motivated by the many real-world applications of reinforcement learning (RL) that require safe-policy iterations, we consider the problem of off-policy evaluation (OPE), i.e., evaluating a new policy using historical data obtained by different behavior policies, under the model of nonstationary episodic Markov Decision Processes (MDP) with a long horizon and a large action space. Existing importance sampling (IS) methods often suffer from large variance that depends exponentially on the RL horizon $H$. To solve this problem, we consider a marginalized importance sampling (MIS) estimator that recursively estimates the state marginal distribution for the target policy at every step. MIS achieves a mean-squared error of [ ] where $\mu$ and $\pi$ are the logging and target policies, $d_t^\mu(s_t)$ and $d_t^\pi(s_t)$ are the marginal distributions of the state at the $t$-th step, $H$ is the horizon, $n$ is the sample size, and $V^\pi$ is the value function of the MDP under $\pi$. The result matches the Cramer-Rao lower bound in Jiang and Li [2016] up to a multiplicative factor of $H$. To the best of our knowledge, this is the first OPE estimation error bound with a polynomial dependence on $H$. Besides…
3. Higher-order tensors can represent scores in a rating system, frames in a video, and images of the same subject. In practice, the measurements are often highly quantized due to the sampling strategies or the quality of devices. Existing works on tensor recovery have focused on data losses and random noise. Only a few works consider tensor recovery from quantized measurements, and these are restricted to binary measurements. This paper, for the first time, addresses the problem of tensor recovery from multi-level quantized measurements by leveraging the low CANDECOMP/PARAFAC (CP) rank property. We study the recovery of both general low-rank tensors and tensors that have a tensor singular value decomposition (TSVD) by solving nonconvex optimization problems. We provide theoretical upper bounds on the recovery error, which diminish to zero as the dimension sizes increase to infinity. We further characterize the fundamental limit of any recovery algorithm and show that our recovery error is nearly order-wise optimal. A tensor-based alternating proximal gradient descent algorithm with a convergence guarantee and a TSVD-based projected gradient descent algorithm are proposed to solve the nonconvex problems. Our recovery methods can also handle data losses and do not necessarily need information about the quantization rule.
4. We study a noisy tensor completion problem of broad practical interest, namely, the reconstruction of a low-rank tensor from highly incomplete and randomly corrupted observations of its entries. Whereas a variety of prior work has been dedicated to this problem, prior algorithms either are computationally too expensive for large-scale applications or come with suboptimal statistical guarantees. Focusing on “incoherent” and well-conditioned tensors of a constant canonical polyadic rank, we propose a two-stage nonconvex algorithm—(vanilla) gradient descent following a rough initialization—that achieves the best of both worlds. Specifically, the proposed nonconvex algorithm faithfully completes the tensor and retrieves all individual tensor factors within nearly linear time, while at the same time enjoying near-optimal statistical guarantees (i.e., minimal sample complexity and optimal estimation accuracy). The estimation errors are evenly spread out across all entries, thus achieving optimal [Formula: see text] statistical accuracy. We also discuss how to extend our approach to accommodate asymmetric tensors. The insight conveyed through our analysis of nonconvex optimization might have implications for other tensor estimation problems.
5. We consider the problem of decomposing a higher-order tensor with binary entries. Such data problems arise frequently in applications such as neuroimaging, recommendation systems, topic modeling, and sensor network localization. We propose a multilinear Bernoulli model, develop a rank-constrained likelihood-based estimation method, and obtain theoretical accuracy guarantees. In contrast to continuous-valued problems, the binary tensor problem exhibits an interesting phase transition phenomenon according to the signal-to-noise ratio. The error bound for the parameter tensor estimation is established, and we show that the obtained rate is minimax optimal under the considered model. Furthermore, we develop an alternating optimization algorithm with convergence guarantees. The efficacy of our approach is demonstrated through both simulations and analyses of multiple data sets on the tasks of tensor completion and clustering.