- Award ID(s):
- 1808159
- PAR ID:
- 10105977
- Date Published:
- Journal Name:
- 2019 IEEE Data Science Workshop (DSW)
- Page Range / eLocation ID:
- 300 to 304
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This work focuses on canonical polyadic decomposition (CPD) for large-scale tensors. Many prior works rely on data sparsity to develop scalable CPD algorithms, which are not suitable for handling dense tensor, while dense tensors often arise in applications such as image and video processing. As an alternative, stochastic algorithms utilize data sampling to reduce per-iteration complexity and thus are very scalable, even when handling dense tensors. However, existing stochastic CPD algorithms are facing some challenges. For example, some algorithms are based on randomly sampled tensor entries, and thus each iteration can only updates a small portion of the latent factors. This may result in slow improvement of the estimation accuracy of the latent factors. In addition, the convergence properties of many stochastic CPD algorithms are unclear, perhaps because CPD poses a hard nonconvex problem and is challenging for analysis under stochastic settings. In this work, we propose a stochastic optimization strategy that can effectively circumvent the above challenges. The proposed algorithm updates a whole latent factor at each iteration using sampled fibers of a tensor, which can quickly increase the estimation accuracy. The algorithm is flexible-many commonly used regularizers and constraints can be easily incorporated in the computational framework. The algorithm is also backed by a rigorous convergence theory. Simulations on large-scale dense tensors are employed to showcase the effectiveness of the algorithm.more » « less
-
null (Ed.)Canonical polyadic decomposition (CPD) has been a workhorse for multimodal data analytics. This work puts forth a stochastic algorithmic framework for CPD under β-divergence, which is well-motivated in statistical learning—where the Euclidean distance is typically not preferred. Despite the existence of a series of prior works addressing this topic, pressing computational and theoretical challenges, e.g., scalability and convergence issues, still remain. In this paper, a unified stochastic mirror descent framework is developed for large-scale β-divergence CPD. Our key contribution is the integrated design of a tensor fiber sampling strategy and a flexible stochastic Bregman divergence-based mirror descent iterative procedure, which significantly reduces the computation and memory cost per iteration for various β. Leveraging the fiber sampling scheme and the multilinear algebraic structure of low-rank tensors, the proposed lightweight algorithm also ensures global convergence to a stationary point under mild conditions. Numerical results on synthetic and real data show that our framework attains significant computational saving compared with state-of-the-art methods.more » « less
-
We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a lowtubal-rank tensor using the tensor singular value decomposition (t-SVD) algebraic framework. We show the t-SVD is a specialization of the well-studied block-term decomposition for third-order tensors, and we present an algorithm under this model that can track changing free submodules from incomplete streaming 2-D data. The proposed algorithm uses principles from incremental gradient descent on the Grassmann manifold of subspaces to solve the tensor completion problem with linear complexity and constant memory in the number of time samples. We provide a local expected linear convergence result for our algorithm. Our empirical results are competitive in accuracy but much faster in compute time than state-of-the-art tensor completion algorithms on real applications to recover temporal chemo-sensing and MRI data under limited sampling.more » « less
-
Abstract Decomposition‐based solution algorithms for optimization problems depend on the underlying latent block structure of the problem. Methods for detecting this structure are currently lacking. In this article, we propose stochastic blockmodeling (SBM) as a systematic framework for learning the underlying block structure in generic optimization problems. SBM is a generative graph model in which nodes belong to some blocks and the interconnections among the nodes are stochastically dependent on their block affiliations. Hence, through parametric statistical inference, the interconnection patterns underlying optimization problems can be estimated. For benchmark optimization problems, we show that SBM can reveal the underlying block structure and that the estimated blocks can be used as the basis for decomposition‐based solution algorithms which can reach an optimum or bound estimates in reduced computational time. Finally, we present a general software platform for automated block structure detection and decomposition‐based solution following distributed and hierarchical optimization approaches.
-
Bach, Francis ; Blei, David ; Scholkopf, Bernhard (Ed.)This paper investigates the asymptotic behaviors of gradient descent algorithms (particularly accelerated gradient descent and stochastic gradient descent) in the context of stochastic optimization arising in statistics and machine learning, where objective functions are estimated from available data. We show that these algorithms can be computationally modeled by continuous-time ordinary or stochastic differential equations. We establish gradient flow central limit theorems to describe the limiting dynamic behaviors of these computational algorithms and the large-sample performances of the related statistical procedures, as the number of algorithm iterations and data size both go to infinity, where the gradient flow central limit theorems are governed by some linear ordinary or stochastic differential equations, like time-dependent Ornstein-Uhlenbeck processes. We illustrate that our study can provide a novel unified framework for a joint computational and statistical asymptotic analysis, where the computational asymptotic analysis studies the dynamic behaviors of these algorithms with time (or the number of iterations in the algorithms), the statistical asymptotic analysis investigates the large-sample behaviors of the statistical procedures (like estimators and classifiers) that are computed by applying the algorithms; in fact, the statistical procedures are equal to the limits of the random sequences generated from these iterative algorithms, as the number of iterations goes to infinity. The joint analysis results based on the obtained gradient flow central limit theorems lead to the identification of four factors—learning rate, batch size, gradient covariance, and Hessian—to derive new theories regarding the local minima found by stochastic gradient descent for solving non-convex optimization problems.more » « less