Title: Computing Optimal Persistent Cycles: Polynomial and Hard Cases
Persistent cycles, especially minimal ones, are useful geometric features that augment the intervals in purely topological persistence diagrams (also termed barcodes). In our earlier work, we showed that computing minimal 1-dimensional persistent cycles (persistent 1-cycles) for finite intervals is NP-hard, while the same problem for infinite intervals is polynomially tractable. In this paper, we address this problem for general dimensions with Z2 coefficients. In addition to proving that it is NP-hard to compute minimal persistent d-cycles (d > 1) for both types of intervals given arbitrary simplicial complexes, we identify two interesting cases which are polynomially tractable. These two cases assume the complex to be a certain generalization of manifolds which we term weak pseudomanifolds. For finite intervals from the d-th persistence diagram of a weak (d+1)-pseudomanifold, we utilize the fact that persistent cycles of such intervals are null-homologous and reduce the problem to a minimal cut problem. Since the same problem for infinite intervals is NP-hard, we further assume the weak (d+1)-pseudomanifold to be embedded in R^{d+1}, so that the complex has a natural dual graph structure and the problem again reduces to a minimal cut problem. Experiments with both algorithms on scientific data indicate that the minimal persistent cycles capture various significant features of the data.
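As a rough illustration of the dual-graph reduction described above (not the authors' implementation; the simplices, weights, and source/sink choice below are hypothetical placeholders), the sketch treats each (d+1)-simplex of a weak (d+1)-pseudomanifold as a node of a dual graph, joins two nodes when the corresponding simplices share a d-simplex, gives that edge the weight of the shared d-simplex, and reads off the d-simplices dual to a minimum s-t cut as a candidate minimal persistent cycle.

```python
# Illustrative sketch: minimal persistent cycle via a minimum cut on the
# dual graph of a weak (d+1)-pseudomanifold (placeholder data, assuming networkx).
import networkx as nx

# Hypothetical dual graph: nodes are (d+1)-simplices; an edge connects two
# (d+1)-simplices sharing a d-simplex, with 'capacity' = that d-simplex's weight.
G = nx.Graph()
G.add_edge("sigma1", "sigma2", capacity=3.0, dual_of="tau12")
G.add_edge("sigma1", "sigma3", capacity=2.0, dual_of="tau13")
G.add_edge("sigma2", "sigma4", capacity=2.0, dual_of="tau24")
G.add_edge("sigma3", "sigma4", capacity=3.0, dual_of="tau34")
G.add_edge("sigma2", "sigma3", capacity=1.0, dual_of="tau23")

# Source/sink stand in for the simplices that create/destroy the interval
# (in the actual algorithms these are determined by the persistence pairing).
source, sink = "sigma1", "sigma4"

cut_value, (side_s, side_t) = nx.minimum_cut(G, source, sink)

# The d-simplices dual to the cut edges form the candidate minimal cycle.
cycle = [G[u][v]["dual_of"] for u in side_s for v in G[u] if v in side_t]
print("cut weight:", cut_value)
print("d-simplices in the candidate minimal persistent cycle:", cycle)
```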
Award ID(s): 1839252
PAR ID: 10182670
Author(s) / Creator(s):
Date Published:
Journal Name: ACM-SIAM Symposium on Discrete Algorithms
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. We consider the problem of finding the minimal-size factorization of the provenance of self-join-free conjunctive queries, i.e., we want to find a formula that minimizes the number of variable repetitions. This problem is equivalent to solving the fundamental Boolean formula factorization problem for the restricted setting of the provenance formulas of self-join-free queries. While general Boolean formula minimization is Σ_2^p-complete, we show that the problem is NP-complete in our case. Additionally, we identify a large class of queries that can be solved in PTIME, expanding beyond the previously known tractable cases of read-once formulas and hierarchical queries. We describe connections between factorizations, Variable Elimination Orders (VEOs), and minimal query plans. We leverage these insights to create an Integer Linear Program (ILP) that can solve the minimal factorization problem exactly. We also propose a Max-Flow Min-Cut (MFMC)-based algorithm that gives an efficient approximate solution. Importantly, we show that both the Linear Programming (LP) relaxation of our ILP and our MFMC-based algorithm are always correct for all currently known PTIME cases. Thus, we present two unified algorithms (ILP and MFMC) that can both recover all known PTIME cases in PTIME, yet also solve NP-complete cases either exactly (ILP) or approximately (MFMC), as desired.
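As a toy illustration of the objective (a hypothetical example, not taken from the paper), the snippet below counts variable repetitions in two logically equivalent provenance formulas of a self-join-free query: the flat form x1·y1 ∨ x1·y2 repeats x1, while the factored form x1·(y1 ∨ y2) mentions every variable exactly once.

```python
# Toy illustration (hypothetical example): count variable repetitions in two
# logically equivalent provenance formulas.
from collections import Counter

def repetitions(variable_occurrences):
    """Number of occurrences beyond the first, summed over all variables."""
    counts = Counter(variable_occurrences)
    return sum(c - 1 for c in counts.values())

# Flat DNF provenance: x1*y1 OR x1*y2  -> variable x1 appears twice.
flat = ["x1", "y1", "x1", "y2"]

# Factored form: x1*(y1 OR y2) -> each variable appears once
# (a read-once factorization, the ideal case mentioned in the abstract).
factored = ["x1", "y1", "y2"]

print("repetitions in flat form:    ", repetitions(flat))      # 1
print("repetitions in factored form:", repetitions(factored))  # 0
```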
  2. We study the identity testing problem in the context of spin systems or undirected graphical models, where it takes the following form: given the parameter specification of the model M and a sampling oracle for the distribution \mu_{M^*} of an unknown model M^*, can we efficiently determine if the two models M and M^* are the same? We consider identity testing for both soft-constraint and hard-constraint systems. In particular, we prove hardness results in two prototypical cases, the Ising model and proper colorings, and explore whether identity testing is any easier than structure learning. For the ferromagnetic (attractive) Ising model, Daskalakis et al. (2018) presented a polynomial time algorithm for identity testing. We prove hardness results in the antiferromagnetic (repulsive) setting in the same regime of parameters where structure learning is known to require a super-polynomial number of samples. In particular, for n-vertex graphs of maximum degree d, we prove that if |\beta| d = \omega(\log n) (where \beta is the inverse temperature parameter), then there is no identity testing algorithm for the antiferromagnetic Ising model that runs in polynomial time unless RP = NP. We also establish computational lower bounds for a broader set of parameters under the (randomized) exponential time hypothesis. In our proofs, we use random graphs as gadgets; this is inspired by similar constructions in seminal works on the hardness of approximate counting. In the hard-constraint setting, we present hardness results for identity testing for proper colorings. Our results are based on the presumed hardness of #BIS, the problem of (approximately) counting independent sets in bipartite graphs. In particular, we prove that identity testing for colorings is hard in the same range of parameters where structure learning is known to be hard, which in turn matches the parameter regime for NP-hardness of the corresponding decision problem.
  3. We analyze the run-time complexity of computing allocations that are both fair and maximize the utilitarian social welfare, defined as the sum of agents’ utilities. We focus on two tractable fairness concepts: envy-freeness up to one item (EF1) and proportionality up to one item (PROP1). We consider two computational problems: (1) Among the utilitarian-maximal allocations, decide whether there exists one that is also fair; (2) among the fair allocations, compute one that maximizes the utilitarian welfare. We show that both problems are strongly NP-hard when the number of agents is variable, and remain NP-hard for a fixed number of agents greater than two. For the special case of two agents, we find that problem (1) is polynomial-time solvable, while problem (2) remains NP-hard. Finally, with a fixed number of agents, we design pseudopolynomial-time algorithms for both problems. We extend our results to the stronger fairness notions envy-freeness up to any item (EFx) and proportionality up to any item (PROPx). 
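For readers unfamiliar with the fairness notions used above, the short sketch below (a generic illustration of the EF1 definition, not the paper's pseudopolynomial algorithm) checks whether an allocation with additive utilities is envy-free up to one item: every agent either does not envy another agent, or stops envying after some single item is removed from the other agent's bundle.

```python
# Generic EF1 check for additive utilities (illustrative only).
# utilities[i][g] is agent i's value for item g; allocation[i] lists agent i's items.
def is_ef1(utilities, allocation):
    n = len(allocation)
    for i in range(n):
        own = sum(utilities[i][g] for g in allocation[i])
        for j in range(n):
            if i == j:
                continue
            other = sum(utilities[i][g] for g in allocation[j])
            if own >= other:
                continue  # no envy toward agent j
            # Envy must vanish after dropping i's most-valued item from j's bundle.
            best = max(utilities[i][g] for g in allocation[j])
            if own < other - best:
                return False
    return True

utilities = [[5, 1, 3], [2, 4, 4]]    # 2 agents, 3 items (made-up values)
allocation = [[0], [1, 2]]            # agent 0 gets item 0; agent 1 gets items 1 and 2
print(is_ef1(utilities, allocation))  # True
```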
  4. Learning task-specific representations of persistence diagrams is an important problem in topological data analysis and machine learning. However, current state-of-the-art methods are restricted in terms of their expressivity, as they are focused on Euclidean representations. Persistence diagrams often contain features of infinite persistence (i.e., essential features), and Euclidean spaces shrink their importance relative to non-essential features because they cannot assign infinite distance to finite points. To deal with this issue, we propose a method to learn representations of persistence diagrams on hyperbolic spaces, more specifically on the Poincaré ball. By representing features of infinite persistence infinitesimally close to the boundary of the ball, their distance to non-essential features approaches infinity, thereby preserving their relative importance. This is achieved without utilizing extremely high values for the learnable parameters, so the representation can be fed into downstream optimization methods and trained efficiently in an end-to-end fashion. We present experimental results on graph and image classification tasks and show that the performance of our method is on par with or exceeds the performance of other state-of-the-art methods.
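To see why placing essential features near the boundary preserves their importance, the snippet below evaluates the standard Poincaré ball distance d(u, v) = arcosh(1 + 2·||u − v||² / ((1 − ||u||²)(1 − ||v||²))) at a few hypothetical 2-D points (not taken from the paper's experiments): as a point approaches the unit sphere, its hyperbolic distance to interior points grows without bound even though the Euclidean distance stays small.

```python
# Poincaré ball distance between points inside the unit ball (standard formula;
# sample points are hypothetical).
import numpy as np

def poincare_distance(u, v):
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u**2)) * (1.0 - np.sum(v**2))
    return np.arccosh(1.0 + 2.0 * sq / denom)

origin = [0.0, 0.0]
# A non-essential feature sits well inside the ball; an essential feature is
# pushed progressively closer to the boundary.
for r in (0.5, 0.9, 0.99, 0.999):
    print(f"distance(origin, ({r}, 0)) = {poincare_distance(origin, [r, 0.0]):.3f}")
# The printed distances grow without bound as r -> 1, mirroring how features of
# infinite persistence keep a large separation from non-essential features.
```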
  5. In the Maximum Independent Set problem we are asked to find a set of pairwise nonadjacent vertices in a given graph with the maximum possible cardinality. In general graphs, this classical problem is known to be NP-hard and hard to approximate within a factor of n^{1−ε} for any ε > 0. Due to this, investigating the complexity of Maximum Independent Set in various graph classes in hope of finding better tractability results is an active research direction. In H-free graphs, that is, graphs not containing a fixed graph H as an induced subgraph, the problem is known to remain NP-hard and APX-hard whenever H contains a cycle, a vertex of degree at least four, or two vertices of degree at least three in one connected component. For the remaining cases, where every component of H is a path or a subdivided claw, the complexity of Maximum Independent Set remains widely open, with only a handful of polynomial-time solvability results for small graphs H such as P_5, P_6, the claw, or the fork. We prove that for every such “possibly tractable” graph H there exists an algorithm that, given an H-free graph G and an accuracy parameter ε > 0, finds an independent set in G of cardinality within a factor of (1 − ε) of the optimum in time exponential in a polynomial of log |V(G)| and ε^{−1}. That is, we show that for every graph H for which Maximum Independent Set is not known to be APX-hard in H-free graphs, the problem admits a quasi-polynomial time approximation scheme in this graph class. Our algorithm works also in the more general weighted setting, where the input graph is supplied with a weight function on vertices and we are maximizing the total weight of an independent set.