

Title: On Fault-Tolerant Low-Diameter Clusters in Graphs
Cliques and their generalizations are frequently used to model “tightly knit” clusters in graphs, and identifying such clusters is a popular technique in graph-based data mining. One such model is the s-club, which is a vertex subset that induces a subgraph of diameter at most s. This model has found use in a variety of fields because low-diameter clusters have practical significance in many applications. As this property is not hereditary on vertex-induced subgraphs, the diameter of a subgraph could increase upon the removal of some vertices, and the subgraph could even become disconnected. For example, star graphs have diameter two but can be disconnected by removing the central vertex. The pursuit of a fault-tolerant extension of the s-club model has spawned two variants that we study in this article: robust s-clubs and hereditary s-clubs. We analyze the complexity of the verification and optimization problems associated with these variants. Then, we propose cut-like integer programming formulations for both variants whenever possible and investigate the separation complexity of the cut-like constraints. We demonstrate through our extensive computational experiments that the algorithmic ideas we introduce enable us to solve the problems to optimality on benchmark instances with several thousand vertices. This work lays the foundations for effective mathematical programming approaches for finding fault-tolerant s-clubs in large-scale networks.
History: Accepted by David Alderson, Area Editor for Network Optimization: Algorithms & Applications.
Funding: The computing for this project was performed at the High Performance Computing Center at Oklahoma State University, supported in part through the National Science Foundation [Grant OAC-1531128]. This material is based upon work supported by the National Science Foundation under [Grants 1662757 and 1942065].
Supplemental Material: The online appendix is available at https://doi.org/10.1287/ijoc.2022.1231.
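To make the s-club definition and the non-hereditary behavior described above concrete, here is a minimal verification sketch (not the paper's algorithm); the networkx-based helper is_s_club is illustrative only, and it replays the star-graph example from the abstract.

```python
# A minimal sketch (not the paper's algorithm): checking the s-club property
# with networkx and replaying the star-graph example from the abstract.
import networkx as nx

def is_s_club(G, S, s):
    """Return True if the subgraph of G induced by S has diameter at most s."""
    H = G.subgraph(S)
    if H.number_of_nodes() <= 1:
        return True
    if not nx.is_connected(H):
        return False          # a disconnected subgraph has infinite diameter
    return nx.diameter(H) <= s

G = nx.star_graph(4)                      # center 0, leaves 1..4: diameter 2
print(is_s_club(G, set(G.nodes), 2))      # True: the whole vertex set is a 2-club
print(is_s_club(G, {1, 2, 3, 4}, 2))      # False: removing the center disconnects the leaves
```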
Award ID(s):
1942065
NSF-PAR ID:
10404895
Author(s) / Creator(s):
Date Published:
Journal Name:
INFORMS Journal on Computing
Volume:
34
Issue:
6
ISSN:
1091-9856
Page Range / eLocation ID:
3181 to 3199
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    The Sparsest Cut is a fundamental optimization problem that has been extensively studied. For planar inputs the problem is in P and can be solved in Õ(n^3) time if all vertex weights are 1. Despite a significant amount of effort, the best algorithms date back to the early 90s and can only achieve O(log n)-approximation in Õ(n) time or 3.5-approximation in Õ(n^2) time [Rao, STOC92]. Our main result is an Ω(n^{2−ε}) lower bound for Sparsest Cut even in planar graphs with unit vertex weights, under the (min, +)-Convolution conjecture, showing that approximations are inevitable in the near-linear time regime. To complement the lower bound, we provide a 3.3-approximation in near-linear time, improving upon the 25-year-old result of Rao in both time and accuracy. We also show that our lower bound is not far from optimal by observing an exact algorithm with running time Õ(n^{5/2}), improving upon the Õ(n^3) algorithm of Park and Phillips [STOC93]. Our lower bound accomplishes a repeatedly raised challenge by being the first fine-grained lower bound for a natural planar graph problem in P. Building on our construction we prove near-quadratic lower bounds under SETH for variants of the closest pair problem in planar graphs, and use them to show that the popular Average-Linkage procedure for Hierarchical Clustering cannot be simulated in truly subquadratic time. At the core of our constructions is a diamond-like gadget that also settles the complexity of Diameter in distributed planar networks. We prove an Ω(n/log n) lower bound on the number of communication rounds required to compute the weighted diameter of a network in the CONGEST model, even when the underlying graph is planar and all nodes are D = 4 hops away from each other. This is the first poly(n) lower bound in the planar-distributed setting, and it complements the recent poly(D, log n) upper bounds of Li and Parter [STOC 2019] for (exact) unweighted diameter and for (1 + ε)-approximate weighted diameter.
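For readers unfamiliar with the objective, the brute-force sketch below computes the sparsity of every cut under unit vertex weights. It is exponential-time and is not one of the algorithms discussed in the abstract above; the normalization by |S|·|V∖S| is one common convention and an assumption on our part, and all names are illustrative.

```python
# Brute-force sketch of the uniform (unit-vertex-weight) Sparsest Cut objective,
# included only to make the problem concrete. It enumerates all cuts, so it is
# exponential-time and is not one of the algorithms from the abstract.
import itertools
import networkx as nx

def sparsest_cut_bruteforce(G):
    nodes = list(G.nodes)
    best, best_side = float("inf"), None
    for r in range(1, len(nodes)):
        for side in itertools.combinations(nodes, r):
            S = set(side)
            cut_edges = sum(1 for u, v in G.edges if (u in S) != (v in S))
            # sparsity of the cut (S, V \ S), normalized by |S| * |V \ S|
            sparsity = cut_edges / (len(S) * (len(nodes) - len(S)))
            if sparsity < best:
                best, best_side = sparsity, S
    return best, best_side

G = nx.cycle_graph(6)                     # a small planar example
print(sparsest_cut_bruteforce(G))         # a balanced cut minimizes the ratio here
```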
  2.
    A graph G is called {\em self-ordered} (a.k.a. asymmetric) if the identity permutation is its only automorphism. Equivalently, there is a unique isomorphism from G to any graph that is isomorphic to G. We say that G=(V,E) is {\em robustly self-ordered} if the size of the symmetric difference between E and the edge-set of the graph obtained by permuting V using any permutation π:V→V is proportional to the number of non-fixed-points of π. In this work, we initiate the study of the structure, construction, and utility of robustly self-ordered graphs. We show that robustly self-ordered bounded-degree graphs exist (in abundance), and that they can be constructed efficiently, in a strong sense. Specifically, given the index of a vertex in such a graph, it is possible to find all its neighbors in polynomial-time (i.e., in time that is poly-logarithmic in the size of the graph). We provide two very different constructions, in tools and structure. The first, a direct construction, is based on proving a sufficient condition for robust self-ordering, which requires that an auxiliary graph, on {\em pairs} of vertices of the original graph, is expanding. In this case the original graph is (not only robustly self-ordered but) also expanding. The second construction proceeds in three steps: It boosts the mere existence of robustly self-ordered graphs, which provides explicit graphs of sublogarithmic size, to an efficient construction of polynomial-size graphs, and then, repeating it again, to exponential-size (robustly self-ordered) graphs that are locally constructible. This construction can yield robustly self-ordered graphs that are either expanders or highly disconnected, having logarithmic-size connected components. We also consider graphs of unbounded degree, seeking correspondingly unbounded robustness parameters. We again demonstrate that such graphs (of linear degree) exist (in abundance), and that they can be constructed efficiently, in a strong sense. This turns out to require very different tools. Specifically, we show that the construction of such graphs reduces to the construction of non-malleable two-source extractors with very weak parameters but with some additional natural features. We actually show two reductions, one simpler than the other but yielding a less efficient construction when combined with the known constructions of extractors. We demonstrate that robustly self-ordered bounded-degree graphs are useful towards obtaining lower bounds on the query complexity of testing graph properties, both in the bounded-degree and the dense graph models. Indeed, their robustness offers efficient, local, and distance-preserving reductions from testing problems on ordered structures (like sequences) to the unordered (effectively unlabeled) graphs. One of the results that we obtain, via such a reduction, is a subexponential separation between the query complexities of testing and tolerant testing of graph properties in the bounded-degree graph model. Changes to previous version: We retract the claims made in our initial posting regarding the construction of non-malleable two-source extractors (which are quasi-orthogonal) as well as the claims about the construction of relocation-detecting codes (see Theorems 1.5 and 1.6 in the original version). The source of trouble is a fundamental flaw in the proof of Lemma 9.7 (in the original version), which may as well be wrong. Hence, the original Section 9 was omitted, except that the original Section 9.3 was retained as a new Section 8.3. The original Section 8 appears as Sections 8.0 and 8.1, and Section 8.2 is new.
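A toy illustration of the quantity in the robust self-ordering definition above: the size of the symmetric difference between E and the permuted edge set, compared with the number of vertices the permutation moves. The helper names and the small path example are illustrative, not from the paper.

```python
# Toy illustration of |E Δ π(E)| versus the number of non-fixed points of π.
def permuted_edges(edges, pi):
    """Edge set obtained by relabeling every endpoint through the permutation pi."""
    return {frozenset((pi[u], pi[v])) for u, v in edges}

def robustness_data(edges, pi):
    E = {frozenset(e) for e in edges}
    sym_diff = E.symmetric_difference(permuted_edges(edges, pi))
    moved = sum(1 for v in pi if pi[v] != v)
    return len(sym_diff), moved

edges = [(0, 1), (1, 2), (2, 3)]          # the path 0-1-2-3
pi = {0: 3, 1: 1, 2: 2, 3: 0}             # swap the two endpoints
print(robustness_data(edges, pi))         # (4, 2): 4 edges differ, 2 vertices moved
```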
  3. The separability of clusters is one of the most desired properties in clustering. There is a wide range of settings in which different clusterings of the same data set appear. We are interested in applications for which there is a need for an explicit, gradual transition of one separable clustering into another one. This transition should be a sequence of simple, natural steps that upholds separability of the clusters throughout. We design an algorithm for such a transition. We exploit the intimate connection of separability and linear programming over bounded-shape partition and transportation polytopes: separable clusterings lie on the boundary of partition polytopes and form a subset of the vertices of the corresponding transportation polytopes, and circuits of both polytopes are readily interpreted as sequential or cyclical exchanges of items between clusters. This allows for a natural approach to achieve the desired transition through a combination of two walks: an edge walk between two so-called radial clusterings in a transportation polytope, computed through an adaptation of classical tools of sensitivity analysis and parametric programming, and a walk from a separable clustering to a corresponding radial clustering, computed through a tailored, iterative routine updating cluster sizes and reoptimizing the cluster assignment of items. Funding: Borgwardt gratefully acknowledges support of this work through National Science Foundation [Grant 2006183] Circuit Walks in Optimization, Algorithmic Foundations, Division of Computing and Communication Foundations; through Air Force Office of Scientific Research [Grant FA9550-21-1-0233] The Hirsch Conjecture for Totally-Unimodular Polyhedra; and through Simons Collaboration [Grant 524210] Polyhedral Theory in Data Analytics. Happach has been supported by the Alexander von Humboldt Foundation with funds from the German Federal Ministry of Education and Research. 
  4. A _theta_ is a graph consisting of two non-adjacent vertices and three internally disjoint paths between them, each of length at least two. For a family $\mathcal{H}$ of graphs, we say a graph $G$ is $\mathcal{H}$-_free_ if no induced subgraph of $G$ is isomorphic to a member of $\mathcal{H}$. We prove a conjecture of Sintiari and Trotignon, that there exists an absolute constant $c$ for which every (theta, triangle)-free graph $G$ has treewidth at most $c\log (|V(G)|)$. A construction by Sintiari and Trotignon shows that this bound is asymptotically best possible, and (theta, triangle)-free graphs comprise the first known hereditary class of graphs with arbitrarily large yet logarithmic treewidth. Our main result is in fact a generalization of the above conjecture, that treewidth is at most logarithmic in $|V(G)|$ for every graph $G$ excluding the so-called _three-path-configurations_ as well as a fixed complete graph. It follows that several NP-hard problems such as Stable Set, Vertex Cover, Dominating Set and $k$-Coloring (for fixed $k$) admit polynomial-time algorithms in graphs excluding the three-path-configurations and a fixed complete graph.
  5. Statistics of small subgraph counts such as triangles, four-cycles, and s-t paths of short lengths reveal important structural properties of the underlying graph. These problems have been widely studied in social network analysis. In most relevant applications, the graphs are not only massive but also change dynamically over time. Most of these problems become hard in the dynamic setting when considering the worst case. In this paper, we ask whether the question of small subgraph counting over dynamic graphs is hard also in the average case. We consider the simplest possible average-case model, where the updates follow an Erdős-Rényi graph: each update selects a pair of vertices (u, v) uniformly at random and flips the existence of the edge (u, v). We develop new lower bounds and matching algorithms in this model for counting four-cycles, counting triangles through a specified point s or a random queried point, and counting s-t paths of lengths 3, 4, and 5. Our results indicate that while counting s-t paths of lengths 3 and 4 is easy in the average case with O(1) update time (note that these problems are hard in the worst case), it becomes hard for s-t paths of length 5. We introduce new techniques that allow us to obtain average-case hardness for these graph problems from the worst-case hardness of the Online Matrix-Vector multiplication problem (OMv). Our techniques rely on recent advances in fine-grained average-case complexity and advance this literature, giving the ability to prove new lower bounds on average-case dynamic algorithms. Read More: https://epubs.siam.org/doi/abs/10.1137/1.9781611977073.23
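As a toy illustration of the dynamic-update setting described above (not the paper's data structure), the sketch below maintains the number of triangles through a fixed vertex s while edges are flipped; the class and method names are hypothetical.

```python
# Toy sketch: maintain the count of triangles containing a fixed vertex s
# under edge flips, updating the count from the neighborhoods involved.
from collections import defaultdict

class TrianglesThroughS:
    def __init__(self, s):
        self.s = s
        self.adj = defaultdict(set)   # adjacency sets of the current graph
        self.count = 0                # number of triangles containing s

    def flip(self, u, v):
        """Insert edge (u, v) if absent, delete it if present; update the count."""
        s = self.s
        present = v in self.adj[u]
        if s in (u, v):
            other = v if u == s else u
            # triangles {s, other, w} need w adjacent to both s and other
            delta = len((self.adj[s] - {other}) & (self.adj[other] - {s}))
        else:
            # the flipped edge closes a triangle with s iff both ends neighbor s
            delta = 1 if (u in self.adj[s] and v in self.adj[s]) else 0
        if present:
            self.adj[u].discard(v)
            self.adj[v].discard(u)
            self.count -= delta
        else:
            self.adj[u].add(v)
            self.adj[v].add(u)
            self.count += delta

T = TrianglesThroughS(0)
for e in [(0, 1), (0, 2), (1, 2)]:
    T.flip(*e)
print(T.count)   # 1: the triangle {0, 1, 2}
T.flip(1, 2)     # flip again, i.e., delete edge (1, 2)
print(T.count)   # 0
```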