Matrix reduction is the standard procedure for computing the persistent homology of a filtered simplicial complex with
Computing Optimal Persistent Cycles: Polynomial and Hard Cases
Persistent cycles, especially the minimal ones, are useful geometric features functioning as augmentations for the intervals in the purely topological persistence diagrams (also termed as barcodes). In our earlier work, we showed that computing minimal 1dimensional persistent cycles (persistent 1cycles) for finite intervals is NPhard while the same for infinite intervals is polynomially tractable. In this paper, we address this problem for general dimensions with Z2 coefficients. In addition to proving that it is NPhard to compute minimal persistent dcycles (d>1) for both types of intervals given arbitrary simplicial complexes, we identify two interesting cases which are polynomially tractable. These two cases assume the complex to be a certain generalization of manifolds which we term as weak pseudomanifolds. For finite intervals from the dth persistence diagram of a weak (d+1)pseudomanifold, we utilize the fact that persistent cycles of such intervals are nullhomologous and reduce the problem to a minimal cut problem. Since the same problem for infinite intervals is NPhard, we further assume the weak (d+1)pseudomanifold to be embedded in R^{d+1}Rd+1 so that the complex has a natural dual graph structure and the problem reduces to a minimal cut problem. Experiments with both algorithms on scientific data indicate that the minimal persistent cycles capture various significant features of the data.
more »
« less
 Award ID(s):
 1839252
 NSFPAR ID:
 10182670
 Date Published:
 Journal Name:
 ACMSIAM Symposium on Discrete Algorithms
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Abstract m simplices. Its output is a particular decomposition of the total boundary matrix, from which the persistence diagrams and generating cycles are derived. Persistence diagrams are known to vary continuously with respect to their input, motivating the study of their computation for timevarying filtered complexes. Computing persistence dynamically can be reduced to maintaining a valid decomposition under adjacent transpositions in the filtration order. Since there are such transpositions, this maintenance procedure exhibits limited scalability and is often too fine for many applications. We propose a coarser strategy for maintaining the decomposition over a 1parameter family of filtrations. By reduction to a particular longest common subsequence problem, we show that the minimal number of decomposition updates$$O(m^2)$$ $O\left({m}^{2}\right)$d can be found in time and$$O(m \log \log m)$$ $O(mloglogm)$O (m ) space, and that the corresponding sequence of permutations—which we call aschedule —can be constructed in time. We also show that, in expectation, the storage needed to employ this strategy is actually sublinear in$$O(d m \log m)$$ $O(dmlogm)$m . Exploiting this connection, we show experimentally that the decrease in operations to compute diagrams across a family of filtrations is proportional to the difference between the expected quadratic number of states and the proposed sublinear coarsening. Applications to video data, dynamic metric space data, and multiparameter persistence are also presented. 
Goaoc, Xavier ; Kerber, Michael (Ed.)We consider the following surveillance problem: Given a set P of n sites in a metric space and a set R of k robots with the same maximum speed, compute a patrol schedule of minimum latency for the robots. Here a patrol schedule specifies for each robot an infinite sequence of sites to visit (in the given order) and the latency L of a schedule is the maximum latency of any site, where the latency of a site s is the supremum of the lengths of the time intervals between consecutive visits to s. When k = 1 the problem is equivalent to the travelling salesman problem (TSP) and thus it is NPhard. For k ≥ 2 (which is the version we are interested in) the problem becomes even more challenging; for example, it is not even clear if the decision version of the problem is decidable, in particular in the Euclidean case. We have two main results. We consider cyclic solutions in which the set of sites must be partitioned into 𝓁 groups, for some 𝓁 ≤ k, and each group is assigned a subset of the robots that move along the travelling salesman tour of the group at equal distance from each other. Our first main result is that approximating the optimal latency of the class of cyclic solutions can be reduced to approximating the optimal travelling salesman tour on some input, with only a 1+ε factor loss in the approximation factor and an O((k/ε) ^k) factor loss in the runtime, for any ε > 0. Our second main result shows that an optimal cyclic solution is a 2(11/k)approximation of the overall optimal solution. Note that for k = 2 this implies that an optimal cyclic solution is optimal overall. We conjecture that this is true for k ≥ 3 as well. The results have a number of consequences. For the Euclidean version of the problem, for instance, combining our results with known results on Euclidean TSP, yields a PTAS for approximating an optimal cyclic solution, and it yields a (2(11/k)+ε)approximation of the optimal unrestricted (not necessarily cyclic) solution. If the conjecture mentioned above is true, then our algorithm is actually a PTAS for the general problem in the Euclidean setting. Similar results can be obtained by combining our results with other known TSP algorithms in nonEuclidean metrics.more » « less

We study the identity testing problem in the context of spin systems or undirected graphical models, where it takes the following form: given the parameter specification of the model M and a sampling oracle for the distribution \mu_{M^*} of an unknown model M^*, can we efficiently determine if the two models M and M^* are the same? We consider identity testing for both softconstraint and hardconstraint systems. In particular, we prove hardness results in two prototypical cases, the Ising model and proper colorings, and explore whether identity testing is any easier than structure learning. For the ferromagnetic (attractive) Ising model, Daskalasis et al. (2018) presented a polynomial time algorithm for identity testing. We prove hardness results in the antiferromagnetic (repulsive) setting in the same regime of parameters where structure learning is known to require a superpolynomial number of samples. In particular, for nvertex graphs of maximum degree d, we prove that if \beta d = \omega(\log n) (where \beta is the inverse temperature parameter), then there is no identity testing algorithm for the antiferromagnetic Ising model that runs in polynomial time unless RP = NP. We also establish computational lower bounds for a broader set of parameters under the (randomized) exponential time hypothesis. In our proofs, we use random graphs as gadgets; this is inspired by similar constructions in seminal works on the hardness of approximate counting. In the hardconstraint setting, we present hardness results for identity testing for proper colorings. Our results are based on the presumed hardness of #BIS, the problem of (approximately) counting independent sets in bipartite graphs. In particular, we prove that identity testing for colorings is hard in the same range of parameters where structure learning is known to be hard, which in turn matches the parameter regime for NPhardness of the corresponding decision problem.more » « less

The Minimum Circuit Size Problem (MCSP) has been the focus of intense study recently; MCSP is hard for SZK under rather powerful reductions, and is provably not hard under "local" reductions computable in TIME(n^0.49) . The question of whether MCSP is NPhard (or indeed, hard even for small subclasses of P) under some of the more familiar notions of reducibility (such as manyone or Turing reductions computable in polynomial time or in AC^0) is closely related to many of the longstanding open questions in complexity theory. All prior hardness results for MCSP hold also for computing somewhat weak approximations to the circuit complexity of a function. Some of these results were proved by exploiting a connection to a notion of timebounded Kolmogorov complexity (KT) and the corresponding decision problem (MKTP). More recently, a new approach for proving improved hardness results for MKTP was developed, but this approach establishes only hardness of extremely good approximations of the form 1+o(1), and these improved hardness results are not yet known to hold for MCSP. In particular, it is known that MKTP is hard for the complexity class DET under nonuniform AC^0 mreductions, implying MKTP is not in AC^0[p] for any prime p. It was still open if similar circuit lower bounds hold for MCSP. One possible avenue for proving a similar hardness result for MCSP would be to improve the hardness of approximation for MKTP beyond 1 + o(1) to omega(1), as KTcomplexity and circuit size are polynomiallyrelated. In this paper, we show that this approach cannot succeed. More specically, we prove that PARITY does not reduce to the problem of computing superlinear approximations to KTcomplexity or circuit size via AC^0Turing reductions that make O(1) queries. This is signicant, since approximating any set in P/poly AC^0reduces to just one query of a much worse approximation of circuit size or KTcomplexity. For weaker approximations, we also prove nonhardness under more powerful reductions. Our nonhardness results are unconditional, in contrast to conditional results presented in earlier work (for more powerful reductions, but for much worse approximations). This highlights obstacles that would have to be overcome by any proof that MKTP or MCSP is hard for NP under AC^0 reductions. It may also be a step toward conrming a conjecture of Murray and Williams, that MCSP is not NPcomplete under logtimeuniform AC0 mreductions.more » « less

INTRODUCTION Solving quantum manybody problems, such as finding ground states of quantum systems, has farreaching consequences for physics, materials science, and chemistry. Classical computers have facilitated many profound advances in science and technology, but they often struggle to solve such problems. Scalable, faulttolerant quantum computers will be able to solve a broad array of quantum problems but are unlikely to be available for years to come. Meanwhile, how can we best exploit our powerful classical computers to advance our understanding of complex quantum systems? Recently, classical machine learning (ML) techniques have been adapted to investigate problems in quantum manybody physics. So far, these approaches are mostly heuristic, reflecting the general paucity of rigorous theory in ML. Although they have been shown to be effective in some intermediatesize experiments, these methods are generally not backed by convincing theoretical arguments to ensure good performance. RATIONALE A central question is whether classical ML algorithms can provably outperform nonML algorithms in challenging quantum manybody problems. We provide a concrete answer by devising and analyzing classical ML algorithms for predicting the properties of ground states of quantum systems. We prove that these ML algorithms can efficiently and accurately predict groundstate properties of gapped local Hamiltonians, after learning from data obtained by measuring other ground states in the same quantum phase of matter. Furthermore, under a widely accepted complexitytheoretic conjecture, we prove that no efficient classical algorithm that does not learn from data can achieve the same prediction guarantee. By generalizing from experimental data, ML algorithms can solve quantum manybody problems that could not be solved efficiently without access to experimental data. RESULTS We consider a family of gapped local quantum Hamiltonians, where the Hamiltonian H ( x ) depends smoothly on m parameters (denoted by x ). The ML algorithm learns from a set of training data consisting of sampled values of x , each accompanied by a classical representation of the ground state of H ( x ). These training data could be obtained from either classical simulations or quantum experiments. During the prediction phase, the ML algorithm predicts a classical representation of ground states for Hamiltonians different from those in the training data; groundstate properties can then be estimated using the predicted classical representation. Specifically, our classical ML algorithm predicts expectation values of products of local observables in the ground state, with a small error when averaged over the value of x . The run time of the algorithm and the amount of training data required both scale polynomially in m and linearly in the size of the quantum system. Our proof of this result builds on recent developments in quantum information theory, computational learning theory, and condensed matter theory. Furthermore, under the widely accepted conjecture that nondeterministic polynomialtime (NP)–complete problems cannot be solved in randomized polynomial time, we prove that no polynomialtime classical algorithm that does not learn from data can match the prediction performance achieved by the ML algorithm. In a related contribution using similar proof techniques, we show that classical ML algorithms can efficiently learn how to classify quantum phases of matter. In this scenario, the training data consist of classical representations of quantum states, where each state carries a label indicating whether it belongs to phase A or phase B . The ML algorithm then predicts the phase label for quantum states that were not encountered during training. The classical ML algorithm not only classifies phases accurately, but also constructs an explicit classifying function. Numerical experiments verify that our proposed ML algorithms work well in a variety of scenarios, including Rydberg atom systems, twodimensional random Heisenberg models, symmetryprotected topological phases, and topologically ordered phases. CONCLUSION We have rigorously established that classical ML algorithms, informed by data collected in physical experiments, can effectively address some quantum manybody problems. These rigorous results boost our hopes that classical ML trained on experimental data can solve practical problems in chemistry and materials science that would be too hard to solve using classical processing alone. Our arguments build on the concept of a succinct classical representation of quantum states derived from randomized Pauli measurements. Although some quantum devices lack the local control needed to perform such measurements, we expect that other classical representations could be exploited by classical ML with similarly powerful results. How can we make use of accessible measurement data to predict properties reliably? Answering such questions will expand the reach of nearterm quantum platforms. Classical algorithms for quantum manybody problems. Classical ML algorithms learn from training data, obtained from either classical simulations or quantum experiments. Then, the ML algorithm produces a classical representation for the ground state of a physical system that was not encountered during training. Classical algorithms that do not learn from data may require substantially longer computation time to achieve the same task.more » « less