Title: Persistence Homology of Proximity Hyper-Graphs for Higher Dimensional Big Data
Persistent Homology (PH) is a method of Topological Data Analysis that analyzes the topological structure of data to help data scientists infer relationships in the data and support informed decision-making. A significant component in the computation of PH is the construction and use of a complex that represents the topological structure of the data. Some complex types are fast to construct but space inefficient, whereas others are costly to construct but space efficient. Unfortunately, no existing complex type is both fast to construct and compact. This paper works to increase the scope of PH to support the computation of low-dimensional homologies (H0–H10) in high-dimensional big data. In particular, this paper exploits the desirable properties of the Vietoris–Rips Complex (VR-Complex) and the Delaunay Complex in order to construct a sparsified complex. The VR-Complex uses a distance matrix to quickly generate a complex up to the desired homology dimension. In contrast, the Delaunay Complex works at the dimensionality of the data to generate a sparsified complex. While construction of the VR-Complex is fast, its size grows exponentially with the size and dimension of the data set; the Delaunay Complex, by comparison, is significantly smaller for any given data dimension. However, its construction requires the computation of a Delaunay Triangulation, which has high computational complexity. As a result, it is difficult to construct a Delaunay Complex for data in dimensions d > 6 that contains more than a few hundred points. The techniques in this paper enable the computation of a topology-preserving sparsification of k-simplices (where k ≪ d) to quickly generate a reduced sparsified complex sufficient to compute homologies up to k-subspace, irrespective of the data dimensionality d.
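The abstract's point that a VR-Complex is quick to build from a distance matrix, yet grows rapidly in size, can be illustrated with a minimal Python sketch. This is not the paper's sparsification method; it is the textbook flag-complex expansion, and the function name `vietoris_rips`, the parameter names, and the convention that an edge exists when the pairwise distance is at most `epsilon` are all assumptions made for illustration.

```python
from math import dist


def vietoris_rips(points, epsilon, max_dim):
    """Return all simplices (as sorted index tuples) of dimension <= max_dim.

    Hypothetical helper: a simplex is included when every pair of its
    vertices lies within `epsilon` of each other (assumed convention).
    """
    n = len(points)
    # Pairwise distance matrix: the only geometric input the construction uses.
    d = [[dist(p, q) for q in points] for p in points]

    simplices = [(i,) for i in range(n)]  # 0-simplices (vertices)
    prev = simplices
    for _ in range(max_dim):
        nxt = []
        for s in prev:
            # Extend each simplex by a higher-indexed vertex that is close
            # to all of its current vertices; this avoids duplicates.
            for v in range(s[-1] + 1, n):
                if all(d[u][v] <= epsilon for u in s):
                    nxt.append(s + (v,))
        simplices.extend(nxt)
        prev = nxt
    return simplices


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
sparse = vietoris_rips(square, 1.0, 2)   # only the four sides become edges
dense = vietoris_rips(square, 2.0, 3)    # every subset of vertices is a simplex
```

When every pair of points is within `epsilon`, the complex contains every vertex subset of size at most `max_dim + 1`, which is the combinatorial growth the abstract describes. The construction never looks at the ambient dimension of the points, only at their pairwise distances.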
Award ID(s):
1909096
NSF-PAR ID:
10466297
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE International Conference on Big Data
Page Range / eLocation ID:
65 to 74
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Topological Data Analysis (TDA) is a data mining technique to characterize the topological features of data. Persistent Homology (PH) is an important tool of TDA that has been applied to a wide range of applications. However, its time and space complexities motivate a need for new methods to compute the PH of high-dimensional data. An important, and memory-intensive, element in the computation of PH is the complex constructed from the input data. In general, PH tools use and focus on optimizing simplicial complexes; less frequently, cubical complexes are also studied. This paper develops a method to construct polytopal complexes (complexes constructed of any mix of convex polytopes) in any dimension ℝ^n. In general, polytopal complexes are significantly smaller than simplicial or cubical complexes. This paper includes an experimental assessment of the impact that polytopal complexes have on the memory complexity and output results of a PH computation.
  2. Persistent Homology (PH) is computationally expensive and is thus generally employed with strict limits on (i) the maximum connectivity distance and (ii) the dimensions of the homology groups to compute (unless working with trivially small data sets). As a result, most studies with PH only work with the H0 and H1 homology groups. This paper examines the identification and isolation of regions of data sets where high-dimensional topological features are suspected to be located. These regions are analyzed with PH to characterize the high-dimensional homology groups contained in that region. Since only the region around a suspected topological feature is analyzed, it is possible to identify high-dimensional homologies piecewise and then assemble the results into a scalable characterization of the original data set.
  3. We study a family of invariants of compact metric spaces that combines the Curvature Sets defined by Gromov in the 1980s with Vietoris–Rips Persistent Homology. For given integers $k\ge 0$ and $n\ge 1$ we consider the dimension $k$ Vietoris–Rips persistence diagrams of all subsets of a given metric space with cardinality at most $n$. We call these invariants persistence sets and denote them as $\textbf{D}_{n,k}^{\textrm{VR}}$. We first point out that this family encompasses the usual Vietoris–Rips diagrams. We then establish that (1) for a certain range of values of the parameters $n$ and $k$, computing these invariants is significantly more efficient than computing the usual Vietoris–Rips persistence diagrams, (2) these invariants have very good discriminating power and, in many cases, capture information that is imperceptible through standard Vietoris–Rips persistence diagrams, and (3) they enjoy stability properties analogous to those of the usual Vietoris–Rips persistence diagrams. We precisely characterize some of them in the case of spheres and surfaces with constant curvature using a generalization of Ptolemy’s inequality. We also identify a rich family of metric graphs for which $\textbf{D}_{4,1}^{\textrm{VR}}$ fully recovers their homotopy type by studying split-metric decompositions. Along the way we prove some useful properties of Vietoris–Rips persistence diagrams using Mayer–Vietoris sequences. These yield a geometric algorithm for computing the Vietoris–Rips persistence diagram of a space $X$ with cardinality $2k+2$ with quadratic time complexity as opposed to the much higher cost incurred by the usual algebraic algorithms relying on matrix reduction.

     
  4. Buchin, Kevin and (Ed.)
    We show how a filtration of Delaunay complexes can be used to approximate the persistence diagram of the distance to a point set in ℝ^d. Whereas the full Delaunay complex can be used to compute this persistence diagram exactly, it may have size O(n^⌈d/2⌉). In contrast, our construction uses only O(n) simplices. The central idea is to connect Delaunay complexes on progressively denser subsamples by considering the flips in an incremental construction as simplices in d+1 dimensions. This approach leads to a very simple and straightforward proof of correctness in geometric terms, because the final filtration is dual to a (d+1)-dimensional Voronoi construction similar to the standard Delaunay filtration. We also show how this complex can be efficiently constructed.
  5. In this paper, we introduce and study representation homology of topological spaces, which is a natural homological extension of representation varieties of fundamental groups. We give an elementary construction of representation homology parallel to the Loday–Pirashvili construction of higher Hochschild homology; in fact, we establish a direct geometric relation between the two theories by proving that the representation homology of the suspension of a (pointed connected) space is isomorphic to its higher Hochschild homology. We also construct some natural maps and spectral sequences relating representation homology to other homology theories associated with spaces (such as Pontryagin algebras, $\mathbb{S}^1$-equivariant homology of the free loop space, and stable homology of automorphism groups of f.g. free groups). We compute representation homology explicitly (in terms of known invariants) in a number of interesting cases, including spheres, suspensions, complex projective spaces, Riemann surfaces, and some 3-dimensional manifolds, such as link complements in $\mathbb{R}^3$ and the lens spaces $L(p,q)$. In the case of link complements, we identify the representation homology in terms of ordinary Hochschild homology, which gives a new algebraic invariant of links in $\mathbb{R}^3$.