skip to main content

Title: Options for multimodal classification based on L1-Tucker decomposition
Most commonly used classification algorithms process data in the form of vectors. At the same time, modern datasets often comprise multimodal measurements that are naturally modeled as multi-way arrays, also known as tensors. Processing multi-way data in their tensor form can enable enhanced inference and classification accuracy. Tucker decomposition is a standard method for tensor data processing, which however has demonstrated severe sensitivity to corrupted measurements due to its L2-norm formulation. In this work, we present a selection of classification methods that employ an L1-norm-based, corruption-resistant reformulation of Tucker (L1-Tucker). Our experimental studies on multiple real datasets corroborate the corruption-resistance and classification accuracy afforded by L1-Tucker.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
SPIE DCS 2019, vol. 10989
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Event detection is gaining increasing attention in smart cities research. Large-scale mobility data serves as an important tool to uncover the dynamics of urban transportation systems, and more often than not the dataset is incomplete. In this article, we develop a method to detect extreme events in large traffic datasets, and to impute missing data during regular conditions. Specifically, we propose a robust tensor recovery problem to recover low-rank tensors under fiber-sparse corruptions with partial observations, and use it to identify events, and impute missing data under typical conditions. Our approach is scalable to large urban areas, taking full advantage of the spatio-temporal correlations in traffic patterns. We develop an efficient algorithm to solve the tensor recovery problem based on the alternating direction method of multipliers (ADMM) framework. Compared with existing l 1 norm regularized tensor decomposition methods, our algorithm can exactly recover the values of uncorrupted fibers of a low-rank tensor and find the positions of corrupted fibers under mild conditions. Numerical experiments illustrate that our algorithm can achieve exact recovery and outlier detection even with missing data rates as high as 40% under 5% gross corruption, depending on the tensor size and the Tucker rank of the low rank tensor. Finally, we apply our method on a real traffic dataset corresponding to downtown Nashville, TN and successfully detect the events like severe car crashes, construction lane closures, and other large events that cause significant traffic disruptions. 
    more » « less
  2. Markopoulos, Panos P. ; Ouyang, Bing (Ed.)
    We consider the problem of unsupervised (blind) evaluation and assessment of the quality of data used for deep neural network (DNN) RF signal classification. When neural networks train on noisy or mislabeled data, they often (over-)fit to the noise measurements and faulty labels, which leads to significant performance degradation. Also, DNNs are vulnerable to adversarial attacks, which can considerably reduce their classification performance, with extremely small perturbations of their input. In this paper, we consider a new method based on L1-norm principal-component analysis (PCA) to improve the quality of labeled wireless data sets that are used for training a convolutional neural network (CNN), and a deep residual network (ResNet) for RF signal classification. Experiments with data generated for eleven classes of digital and analog modulated signals show that L1-norm tensor conformity curation of the data identifies and removes from the training data set inappropriate class instances that appear due to mislabeling and universal black-box adversarial attacks and drastically improves/restores the classification accuracy of the identified deep neural network architectures. 
    more » « less
  3. In the past few decades, there has been rapid growth in quantity and variety of healthcare data. These large sets of data are usually high dimensional (e.g. patients, their diagnoses, and medications to treat their diagnoses) and cannot be adequately represented as matrices. Thus, many existing algorithms can not analyze them. To accommodate these high dimensional data, tensor factorization, which can be viewed as a higher-order extension of methods like PCA, has attracted much attention and emerged as a promising solution. However, tensor factorization is a computationally expensive task, and existing methods developed to factor large tensors are not flexible enough for real-world situations. To address this scaling problem more efficiently, we introduce SGranite, a distributed, scalable, and sparse tensor factorization method fit through stochastic gradient descent. SGranite offers three contributions: (1) Scalability: it employs a block partitioning and parallel processing design and thus scales to large tensors, (2) Accuracy: we show that our method can achieve results faster without sacrificing the quality of the tensor decomposition, and (3) FlexibleConstraints: we show our approach can encompass various kinds of constraints including l2 norm, l1 norm, and logistic regularization. We demonstrate SGranite's capabilities in two real-world use cases. In the first, we use Google searches for flu-like symptoms to characterize and predict influenza patterns. In the second, we use SGranite to extract clinically interesting sets (i.e., phenotypes) of patients from electronic health records. Through these case studies, we show SGranite has the potential to be used to rapidly characterize, predict, and manage a large multimodal datasets, thereby promising a novel, data-driven solution that can benefit very large segments of the population. 
    more » « less
  4. Tucker decomposition is a low-rank tensor approximation that generalizes a truncated matrix singular value decomposition (SVD). Existing parallel software has shown that Tucker decomposition is particularly effective at compressing terabyte-sized multidimensional scientific simulation datasets, computing reduced representations that satisfy a specified approximation error. The general approach is to get a low-rank approximation of the input data by performing a sequence of matrix SVDs of tensor unfoldings, which tend to be short-fat matrices. In the existing approach, the SVD is performed by computing the eigendecomposition of the Gram matrix of the unfolding. This method sacrifices some numerical stability in exchange for lower computation costs and easier parallelization. We propose using a more numerically stable though more computationally expensive way to compute the SVD by pre- processing with a QR decomposition step and computing an SVD of only the small triangular factor. The more numerically stable approach allows us to achieve the same accuracy with half the working precision (for example, single rather than double precision). We demonstrate that our method scales as well as the existing approach, and the use of lower precision leads to an overall reduction in running time of up to a factor of 2 when using 10s to 1000s of processors. Using the same working precision, we are also able to compute Tucker decompositions with much smaller approximation error. 
    more » « less
  5. Linear encoding of sparse vectors is widely popular, but is commonly data-independent – missing any possible extra (but a priori unknown) structure beyond sparsity. In this paper we present a new method to learn linear encoders that adapt to data, while still performing well with the widely used l1 decoder. The convex l1 decoder prevents gradient propagation as needed in standard gradient-based training. Our method is based on the insight that unrolling the convex decoder into T projected subgradient steps can address this issue. Our method can be seen as a data-driven way to learn a compressed sensing measurement matrix. We compare the empirical performance of 10 algorithms over 6 sparse datasets (3 synthetic and 3 real). Our experiments show that there is indeed additional structure beyond sparsity in the real datasets; our method is able to discover it and exploit it to create excellent reconstructions with fewer measurements (by a factor of 1.1-3x) compared to the previous state-of-the-art methods. We illustrate an application of our method in learning label embeddings for extreme multi-label classification, and empirically show that our method is able to match or outperform the precision scores of SLEEC, which is one of the state-of-the-art embedding-based approaches. 
    more » « less