skip to main content


Title: Polytopal Complex Construction and Use in Persistent Homology
Topological Data Analysis (TDA) is a data mining technique to characterize the topological features of data. Persistent Homology (PH) is an important tool of TDA that has been applied to a wide range of applications. However its time and space complexities motivates a need for new methods to compute the PH of high-dimensional data. An important, and memory intensive, element in the computation of PH is the complex constructed from the input data. In general, PH tools use and focus on optimizing simplicial complexes; less frequently cubical complexes are also studied. This paper develops a method to construct polytopal complexes (or complexes constructed of any mix of convex polytopes) in any dimension Rn . In general, polytopal complexes are significantly smaller than simplicial or cubical complexes. This paper includes an experimental assessment of the impact that polytopal complexes have on memory complexity and output results of a PH computation.  more » « less
Award ID(s):
1909096
NSF-PAR ID:
10466293
Author(s) / Creator(s):
;
Date Published:
Journal Name:
ICDM Workshop on High Dimensional Data Mining
Page Range / eLocation ID:
634 to 641
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Topological Data Analysis (TDA) studies the “shape” of data. A common topological descriptor is the persistence diagram, which encodes topological features in a topological space at different scales. Turner, Mukherjee, and Boyer showed that one can reconstruct a simplicial complex embedded in R^3 using persistence diagrams generated from all possible height filtrations (an uncountably infinite number of directions). In this paper, we present an algorithm for reconstructing plane graphs K = (V, E) in R^2, i.e., a planar graph with vertices in general position and a straight-line embedding, from a quadratic number height filtrations and their respective persistence diagrams. 
    more » « less
  2. Topological Data Analysis (TDA) studies the shape of data. A common topological descriptor is the persistence diagram, which encodes topological features in a topological space at different scales. Turner, Mukeherjee, and Boyer showed that one can reconstruct a simplicial complex embedded in R^3 using persistence diagrams generated from all possible height filtrations (an uncountably infinite number of directions). In this paper, we present an algorithm for reconstructing plane graphs K=(V,E) in R^2 , i.e., a planar graph with vertices in general position and a straight-line embedding, from a quadratic number height filtrations and their respective persistence diagrams. 
    more » « less
  3. Cycle representatives of persistent homology classes can be used to provide descriptions of topological features in data. However, the non-uniqueness of these representatives creates ambiguity and can lead to many different interpretations of the same set of classes. One approach to solving this problem is to optimize the choice of representative against some measure that is meaningful in the context of the data. In this work, we provide a study of the effectiveness and computational cost of several ℓ 1 minimization optimization procedures for constructing homological cycle bases for persistent homology with rational coefficients in dimension one, including uniform-weighted and length-weighted edge-loss algorithms as well as uniform-weighted and area-weighted triangle-loss algorithms. We conduct these optimizations via standard linear programming methods, applying general-purpose solvers to optimize over column bases of simplicial boundary matrices. Our key findings are: 1) optimization is effective in reducing the size of cycle representatives, though the extent of the reduction varies according to the dimension and distribution of the underlying data, 2) the computational cost of optimizing a basis of cycle representatives exceeds the cost of computing such a basis, in most data sets we consider, 3) the choice of linear solvers matters a lot to the computation time of optimizing cycles, 4) the computation time of solving an integer program is not significantly longer than the computation time of solving a linear program for most of the cycle representatives, using the Gurobi linear solver, 5) strikingly, whether requiring integer solutions or not, we almost always obtain a solution with the same cost and almost all solutions found have entries in { ‐ 1,0,1 } and therefore, are also solutions to a restricted ℓ 0 optimization problem, and 6) we obtain qualitatively different results for generators in Erdős-Rényi random clique complexes than in real-world and synthetic point cloud data. 
    more » « less
  4. Gørtz, Inge Li ; Farach-Colton, Martin ; Puglisi, Simon J. ; Herman, Grzegorz (Ed.)
    We present efficient algorithms for solving systems of linear equations in 1-Laplacians of well-shaped simplicial complexes. 1-Laplacians, or higher-dimensional Laplacians, generalize graph Laplacians to higher-dimensional simplicial complexes and play a key role in computational topology and topological data analysis. Previously, nearly-linear time solvers were developed for simplicial complexes with known collapsing sequences and bounded Betti numbers, such as those triangulating a three-ball in ℝ³ (Cohen, Fasy, Miller, Nayyeri, Peng, and Walkington [SODA'2014], Black, Maxwell, Nayyeri, and Winkelman [SODA'2022], Black and Nayyeri [ICALP'2022]). Furthermore, Nested Dissection provides quadratic time solvers for more general systems with nonzero structures representing well-shaped simplicial complexes embedded in ℝ³. We generalize the specialized solvers for 1-Laplacians to simplicial complexes with additional geometric structures but without collapsing sequences and bounded Betti numbers, and we improve the runtime of Nested Dissection. We focus on simplicial complexes that meet two conditions: (1) each individual simplex has a bounded aspect ratio, and (2) they can be divided into "disjoint" and balanced regions with well-shaped interiors and boundaries. Our solvers draw inspiration from the Incomplete Nested Dissection for stiffness matrices of well-shaped trusses (Kyng, Peng, Schwieterman, and Zhang [STOC'2018]). 
    more » « less
  5. Over the last two decades, topological data analysis (TDA) has emerged as a very powerful data analytic approach that can deal with various data modalities of varying complexities. One of the most commonly used tools in TDA is persistent homology (PH), which can extract topological properties from data at various scales. The aim of this article is to introduce TDA concepts to a statistical audience and provide an approach to analyzing multivariate time series data. The application’s focus will be on multivariate brain signals and brain connectivity networks. Finally, this paper concludes with an overview of some open problems and potential application of TDA to modeling directionality in a brain network, as well as the casting of TDA in the context of mixed effect models to capture variations in the topological properties of data collected from multiple subjects.

     
    more » « less