skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Stochastic Optimization for Coupled Tensor Decomposition with Applications in Statistical Learning
Coupled tensor decomposition aims at factoring a number of tensors that share some of their latent factors. Existing algorithms for coupled canonical polyadic decomposition (CPD) face serious scalablity challenges, especially when the number of tensors is large. However, a large amount of coupled tensors naturally arise in timely applications such as statistical learning, e.g., when estimating the joint probability mass function (PMF) of many random variables from marginal PMFs. Stochastic algorithms that admit lightweight updates exist for coupled decomposition, but these algorithms cannot handle complex constraints (e.g., the probability simplex constraint that is important in statistical learning) due to their sampling patterns. This work puts forth a simple data-sampling and block variable-updating strategy for simultaneously factoring a large number of coupled tensors. The proposed algorithm enjoys low per-iteration complexity and can easily handle constraints on latent factors. We also show that this multi-block algorithm admits a nice connection to the classic single-block stochastic proximal gradient (SPG), and thus it naturally inherits convergence properties of SPG. Synthetic and real-data experiments show that the proposed algorithm is very promising for statistical learning problems.  more » « less
Award ID(s):
1808159
PAR ID:
10105977
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2019 IEEE Data Science Workshop (DSW)
Page Range / eLocation ID:
300 to 304
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This work focuses on canonical polyadic decomposition (CPD) for large-scale tensors. Many prior works rely on data sparsity to develop scalable CPD algorithms, which are not suitable for handling dense tensor, while dense tensors often arise in applications such as image and video processing. As an alternative, stochastic algorithms utilize data sampling to reduce per-iteration complexity and thus are very scalable, even when handling dense tensors. However, existing stochastic CPD algorithms are facing some challenges. For example, some algorithms are based on randomly sampled tensor entries, and thus each iteration can only updates a small portion of the latent factors. This may result in slow improvement of the estimation accuracy of the latent factors. In addition, the convergence properties of many stochastic CPD algorithms are unclear, perhaps because CPD poses a hard nonconvex problem and is challenging for analysis under stochastic settings. In this work, we propose a stochastic optimization strategy that can effectively circumvent the above challenges. The proposed algorithm updates a whole latent factor at each iteration using sampled fibers of a tensor, which can quickly increase the estimation accuracy. The algorithm is flexible-many commonly used regularizers and constraints can be easily incorporated in the computational framework. The algorithm is also backed by a rigorous convergence theory. Simulations on large-scale dense tensors are employed to showcase the effectiveness of the algorithm. 
    more » « less
  2. null (Ed.)
    Canonical polyadic decomposition (CPD) has been a workhorse for multimodal data analytics. This work puts forth a stochastic algorithmic framework for CPD under β-divergence, which is well-motivated in statistical learning—where the Euclidean distance is typically not preferred. Despite the existence of a series of prior works addressing this topic, pressing computational and theoretical challenges, e.g., scalability and convergence issues, still remain. In this paper, a unified stochastic mirror descent framework is developed for large-scale β-divergence CPD. Our key contribution is the integrated design of a tensor fiber sampling strategy and a flexible stochastic Bregman divergence-based mirror descent iterative procedure, which significantly reduces the computation and memory cost per iteration for various β. Leveraging the fiber sampling scheme and the multilinear algebraic structure of low-rank tensors, the proposed lightweight algorithm also ensures global convergence to a stationary point under mild conditions. Numerical results on synthetic and real data show that our framework attains significant computational saving compared with state-of-the-art methods. 
    more » « less
  3. We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a lowtubal-rank tensor using the tensor singular value decomposition (t-SVD) algebraic framework. We show the t-SVD is a specialization of the well-studied block-term decomposition for third-order tensors, and we present an algorithm under this model that can track changing free submodules from incomplete streaming 2-D data. The proposed algorithm uses principles from incremental gradient descent on the Grassmann manifold of subspaces to solve the tensor completion problem with linear complexity and constant memory in the number of time samples. We provide a local expected linear convergence result for our algorithm. Our empirical results are competitive in accuracy but much faster in compute time than state-of-the-art tensor completion algorithms on real applications to recover temporal chemo-sensing and MRI data under limited sampling. 
    more » « less
  4. Abstract Decomposition‐based solution algorithms for optimization problems depend on the underlying latent block structure of the problem. Methods for detecting this structure are currently lacking. In this article, we propose stochastic blockmodeling (SBM) as a systematic framework for learning the underlying block structure in generic optimization problems. SBM is a generative graph model in which nodes belong to some blocks and the interconnections among the nodes are stochastically dependent on their block affiliations. Hence, through parametric statistical inference, the interconnection patterns underlying optimization problems can be estimated. For benchmark optimization problems, we show that SBM can reveal the underlying block structure and that the estimated blocks can be used as the basis for decomposition‐based solution algorithms which can reach an optimum or bound estimates in reduced computational time. Finally, we present a general software platform for automated block structure detection and decomposition‐based solution following distributed and hierarchical optimization approaches. 
    more » « less
  5. Multi-instance learning (MIL) is an area of machine learning that handles data that is organized into sets of instances known as bags. Traditionally, MIL is used in the supervised-learning setting and is able to classify bags which can contain any number of instances. This property allows MIL to be naturally applied to solve the problems in a wide variety of real-world applications from computer vision to healthcare. However, many traditional MIL algorithms do not scale efficiently to large datasets. In this paper we present a novel Primal-Dual Multi-Instance Support Vector Machine (pdMISVM) derivation and implementation that can operate efficiently on large scale data. Our method relies on an algorithm derived using a multi-block variation of the alternating direction method of multipliers (ADMM). The approach presented in this work is able to scale to large-scale data since it avoids iteratively solving quadratic programming problems which are generally used to optimize MIL algorithms based on SVMs. In addition, we modify our derivation to include an additional optimization designed to avoid solving a least-squares problem during our algorithm; this optimization increases the utility of our approach to handle a large number of features as well as bags. Finally, we apply our approach to synthetic and real-world multi-instance datasets to illustrate the scalability, promising predictive performance, and interpretability of our proposed method. We end our discussion with an extension of our approach to handle non-linear decision boundaries. Code and data for our methods are available online at: https://github.com/minds-mines/pdMISVM.jl. 
    more » « less