Consider a multi-aspect tensor dataset which is only observed in multiple complementary aggregated versions, each one at a lower resolution than the highest available one. Recent work [2] has demonstrated that given two such tensors, which have been aggregated in lower resolutions in complementary dimensions, we can pose and solve the disaggregation as an instance of a coupled tensor decomposition. In this work, we are exploring the scenario in which, in addition to the two complementary aggregated views, we also have access to a graph where nodes correspond to samples of the tensor mode that has not been aggregated. Given this graph, we propose a graph-assisted tensor disaggregation method. In our experimental evaluation,we demonstrate that our proposed method performs on par with the state of the art when the rank of the underlying coupled tensor decomposition is low, and significantly outperforms the state of the art in cases where the rank increases, producing more robust and higher-quality disaggregation.
more »
« less
Tendi: Tensor Disaggregation from Multiple Coarse Views
Multidimensional data appear in various interesting applications, e.g., sales data indexed by stores, items, and time. Oftentimes, data are observed aggregated over multiple data atoms, thus exhibit low resolution. Temporal aggregation is most common, but many datasets are also aggregated over other attributes. Multidimensional data, in particular, are sometimes available in multiple coarse views, aggregated across different dimensions – especially when sourced by different agencies. For instance, item sales can be aggregated temporally, and over groups of stores based on their location or affiliation. However, data in finer granularity significantly benefit forecasting and data analytics, prompting increasing interest in data disaggregation methods. In this paper, we propose Tendi, a principled model that efficiently disaggregates multidimensional (tensor) data from multiple views, aggregated over different dimensions. Tendi employs coupled tensor factorization to fuse the multiple views and provide recovery guarantees under realistic conditions. We also propose a variant of Tendi, called TendiB, which performs the disaggregation task without any knowledge of the aggregation mechanism. Experiments on real data from different domains demonstrate the high effectiveness of the proposed methods.
more »
« less
- Award ID(s):
- 1704074
- PAR ID:
- 10287995
- Editor(s):
- Lauw H., Wong RW.
- Date Published:
- Journal Name:
- Advances in Knowledge Discovery and Data Mining. PAKDD 2020. Lecture Notes in Computer Science, vol 12085. Springer, Cham. https://doi.org/10.1007/978-3-030-47436-2_65
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Spectrum cartography aims at estimating power propagation patterns over a geographical region across multiple frequency bands (i.e., a radio map)—from limited samples taken sparsely over the region. Classic cartography methods are mostly concerned with recovering the aggregate radio frequency (RF) information while ignoring the constituents of the radio map—but fine-grained emitter-level RF information is of great interest. In addition, many existing cartography methods explicitly or implicitly assume random spatial sampling schemes that may be difficult to implement, due to legal/privacy/security issues. The theoretical aspects (e.g., identifiability of the radio map) of many existing methods are also unclear. In this work, we propose a joint radio map recovery and disaggregation method that is based on coupled block-term tensor decomposition. Our method guarantees identifiability of the individual radio map of each emitter (thereby that of the aggregate radio map as well), under realistic conditions. The identifiability result holds under a large variety of geographical sampling patterns, including a number of pragmatic systematic sampling strategies. We also propose effective optimization algorithms to carry out the formulated radio map disaggregation problems. Extensive simulations are employed to showcase the effectiveness of the proposed approach.more » « less
-
Representing videos as linear subspaces on Grassmann manifolds has made great strides in action recognition problems. Recent studies have explored the convenience of discriminant analysis by making use of Grassmann kernels. However, traditional methods rely on the matrix representation of videos based on the temporal dimension and suffer from not considering the two spatial dimensions. To overcome this problem, we keep the natural form of videos by representing video inputs as multidimensional arrays known as tensors and propose a tensor discriminant analysis approach on Grassmannian manifolds. Because matrix algebra does not handle tensor data, we introduce a new Grassmann projection kernel based on the tensor-tensor decomposition and product. Experiments with human action databases show that the proposed method performs well compared with the state-of-the-art algorithms.more » « less
-
null (Ed.)State aggregation is a popular model reduction method rooted in optimal control. It reduces the complexity of engineering systems by mapping the system’s states into a small number of meta-states. The choice of aggregation map often depends on the data analysts’ knowledge and is largely ad hoc. In this paper, we propose a tractable algorithm that estimates the probabilistic aggregation map from the system’s trajectory. We adopt a soft-aggregation model, where each meta-state has a signature raw state, called an anchor state. This model includes several common state aggregation models as special cases. Our proposed method is a simple two- step algorithm: The first step is spectral decomposition of empirical transition matrix, and the second step conducts a linear transformation of singular vectors to find their approximate convex hull. It outputs the aggregation distributions and disaggregation distributions for each meta-state in explicit forms, which are not obtainable by classical spectral methods. On the theoretical side, we prove sharp error bounds for estimating the aggregation and disaggregation distributions and for identifying anchor states. The analysis relies on a new entry-wise deviation bound for singular vectors of the empirical transition matrix of a Markov process, which is of independent interest and cannot be deduced from existing literature. The application of our method to Manhattan traffic data successfully generates a data-driven state aggregation map with nice interpretations.more » « less
-
Data collected at very frequent intervals is usually extremely sparse and has no structure that is exploitable by modern tensor decomposition algorithms. Thus, the utility of such tensors is low, in terms of the amount of interpretable and exploitable structure that one can extract from them. In this paper, we introduce the problem of finding a tensor of adaptive aggregated granularity that can be decomposed to reveal meaningful latent concepts (structures) from datasets that, in their original form, are not amenable to tensor analysis. Such datasets fall under the broad category of sparse point processes that evolve over space and/or time. To the best of our knowledge, this is the first work that explores adaptive granularity aggregation in tensors. Furthermore, we formally define the problem and discuss different definitions of “good structure” that are in practice and show that the optimal solution is of prohibitive combinatorial complexity. Subsequently, we propose an efficient and effective greedy algorithm called I CE B REAKER , which follows a number of intuitive decision criteria that locally maximize the “goodness of structure,” resulting in high-quality tensors. We evaluate our method on synthetic, semi-synthetic, and real datasets. In all the cases, our proposed method constructs tensors that have a very high structure quality.more » « less
An official website of the United States government

