skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Minimal Cycle Representatives in Persistent Homology Using Linear Programming: An Empirical Study With User’s Guide
Cycle representatives of persistent homology classes can be used to provide descriptions of topological features in data. However, the non-uniqueness of these representatives creates ambiguity and can lead to many different interpretations of the same set of classes. One approach to solving this problem is to optimize the choice of representative against some measure that is meaningful in the context of the data. In this work, we provide a study of the effectiveness and computational cost of several ℓ 1 minimization optimization procedures for constructing homological cycle bases for persistent homology with rational coefficients in dimension one, including uniform-weighted and length-weighted edge-loss algorithms as well as uniform-weighted and area-weighted triangle-loss algorithms. We conduct these optimizations via standard linear programming methods, applying general-purpose solvers to optimize over column bases of simplicial boundary matrices. Our key findings are: 1) optimization is effective in reducing the size of cycle representatives, though the extent of the reduction varies according to the dimension and distribution of the underlying data, 2) the computational cost of optimizing a basis of cycle representatives exceeds the cost of computing such a basis, in most data sets we consider, 3) the choice of linear solvers matters a lot to the computation time of optimizing cycles, 4) the computation time of solving an integer program is not significantly longer than the computation time of solving a linear program for most of the cycle representatives, using the Gurobi linear solver, 5) strikingly, whether requiring integer solutions or not, we almost always obtain a solution with the same cost and almost all solutions found have entries in { ‐ 1,0,1 } and therefore, are also solutions to a restricted ℓ 0 optimization problem, and 6) we obtain qualitatively different results for generators in Erdős-Rényi random clique complexes than in real-world and synthetic point cloud data.  more » « less
Award ID(s):
1854683 1854703
PAR ID:
10336630
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Frontiers in Artificial Intelligence
Volume:
4
ISSN:
2624-8212
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Aichholzer, Oswin; Wang, Haitao (Ed.)
    Given a zigzag filtration, we want to find its barcode representatives, i.e., a compatible choice of bases for the homology groups that diagonalize the linear maps in the zigzag. To achieve this, we convert the input zigzag to a levelset zigzag of a real-valued function. This function generates a Mayer-Vietoris pyramid of spaces, which generates an infinite strip of homology groups. We call the origins of indecomposable (diamond) summands of this strip their apexes and give an algorithm to find representative cycles in these apexes from ordinary persistence computation. The resulting representatives map back to the levelset zigzag and thus yield barcode representatives for the input zigzag. Our algorithm for lifting a p-dimensional cycle from ordinary persistence to an apex representative takes O(p ⋅ m log m) time. From this we can recover zigzag representatives in time O(log m + C), where C is the size of the output. 
    more » « less
  2. Persistent Homology is a computational method of data mining in the field of Topological Data Analysis. Large-scale data analysis with persistent homology is computationally expensive and memory intensive. The performance of persistent homology has been rigorously studied to optimize data encoding and intermediate data structures for high-performance computation. This paper provides an application-centric survey of the High-Performance Computation of Persistent Homology. Computational topology concepts are reviewed and detailed for a broad data science and engineering audience. 
    more » « less
  3. Kernel methods for solving partial differential equations work coordinate-free on the surface and yield high approximation rates for smooth solutions. Localized Lagrange bases have proven to alleviate the computational complexity of usual kernel methods for data fitting problems, but the efficient numerical solution of the ill-conditioned linear systems of equations arising from kernel- based Galerkin solutions to PDEs is a challenging problem which has not been addressed in the literature so far. In this article we apply the framework of the geometric multigrid method with a τ ≥ 2-cycle to scattered, quasi-uniform point clouds on the surface. We show that the resulting solver can be accelerated by using the Lagrange function decay and obtain satisfying convergence rates by a rigorous analysis. In particular, we show that the computational cost of the linear solver scales log-linear in the degrees of freedom. 
    more » « less
  4. null (Ed.)
    We consider the best subset selection problem in linear regression—that is, finding a parsimonious subset of the regression variables that provides the best fit to the data according to some predefined criterion. We are primarily concerned with alternatives to cross-validation methods that do not require data partitioning and involve a range of information criteria extensively studied in the statistical literature. We show that the problem of interest can be modeled using fractional mixed-integer optimization, which can be tackled by leveraging recent advances in modern optimization solvers. The proposed algorithms involve solving a sequence of mixed-integer quadratic optimization problems (or their convexifications) and can be implemented with off-the-shelf solvers. We report encouraging results in our computational experiments, with respect to both the optimization and statistical performance. Summary of Contribution: This paper considers feature selection problems with information criteria. We show that by adopting a fractional optimization perspective (a well-known field in nonlinear optimization and operations research), it is possible to leverage recent advances in mixed-integer quadratic optimization technology to tackle traditional statistical problems long considered intractable. We present extensive computational experiments, with both synthetic and real data, illustrating that the new fractional optimization approach is orders of magnitude faster than existing approaches in the literature. 
    more » « less
  5. We introduce harmonic persistent homology spaces for filtrations of finite simplicial complexes. As a result we can associate concrete subspaces of cycles to each bar of the barcode of the filtration. We prove stability of the harmonic persistent homology subspaces, as well as the subspaces associated to the bars of the barcodes, under small perturbations of functions defining them. We relate the notion of ``essential simplices'' introduced in an earlier work to identify simplices which play a significant role in the birth of a bar, with that of harmonic persistent homology. We prove that the harmonic representatives of simple bars maximizes the ``relative essential content'' amongst all representatives of the bar, where the relative essential content is the weight a particular cycle puts on the set of essential simplices. \footnote{An extended abstract of the paper appeared in the Proceedings of the IEEE Symposium on the Foundations of Computer Science, 2021.} 
    more » « less