
Title: Hypergraphs with edge-dependent vertex weights: p-Laplacians and spectral clustering
We study p-Laplacians and spectral clustering for a recently proposed hypergraph model that incorporates edge-dependent vertex weights (EDVW). These weights can reflect the differing importance of vertices within a hyperedge, giving the hypergraph model greater expressivity and flexibility. By constructing submodular EDVW-based splitting functions, we convert hypergraphs with EDVW into submodular hypergraphs, for which the spectral theory is better developed. In this way, existing concepts and theorems such as p-Laplacians and Cheeger inequalities proposed under the submodular hypergraph setting can be directly extended to hypergraphs with EDVW. For submodular hypergraphs with EDVW-based splitting functions, we propose an efficient algorithm to compute the eigenvector associated with the second smallest eigenvalue of the hypergraph 1-Laplacian. We then use this eigenvector to cluster the vertices, achieving higher clustering accuracy than traditional spectral clustering based on the 2-Laplacian. More broadly, the proposed algorithm works for all submodular hypergraphs that are graph reducible. Numerical experiments using real-world data demonstrate the effectiveness of combining spectral clustering based on the 1-Laplacian and EDVW.
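As a rough illustration of what an EDVW-based splitting function can look like, the sketch below scores a split of a single hyperedge using only the total EDVW on each side. The abstract does not state the construction used in the paper; the choice `min(mass, total - mass)` is an assumption made here for illustration (it is a concave function of the weight sum, which makes the resulting set function submodular), and the names `edvw_split_penalty` and `is_submodular` are hypothetical helpers.

```python
from itertools import combinations


def edvw_split_penalty(edvw, side):
    """Penalty for splitting one hyperedge into `side` and its complement.

    `edvw` maps each vertex of the hyperedge to its edge-dependent weight.
    Illustrative assumption: the penalty is min(mass, total - mass), where
    `mass` is the EDVW sum on `side`. It is zero when the edge is not cut.
    """
    total = sum(edvw.values())
    mass = sum(w for v, w in edvw.items() if v in side)
    return min(mass, total - mass)


def is_submodular(edvw, tol=1e-12):
    """Brute-force check of f(A) + f(B) >= f(A | B) + f(A & B) on one hyperedge."""
    verts = list(edvw)
    subsets = [
        frozenset(c)
        for r in range(len(verts) + 1)
        for c in combinations(verts, r)
    ]
    return all(
        edvw_split_penalty(edvw, a) + edvw_split_penalty(edvw, b)
        >= edvw_split_penalty(edvw, a | b) + edvw_split_penalty(edvw, a & b) - tol
        for a in subsets
        for b in subsets
    )


# Example hyperedge with three vertices of unequal importance.
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
print(edvw_split_penalty(weights, {"a"}))  # cutting off the light vertex is cheap
print(is_submodular(weights))
```

The brute-force check is exponential in the hyperedge size and is only meant for sanity-testing small examples.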
Journal Name: Frontiers in Big Data
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract: We develop a framework for incorporating edge-dependent vertex weights (EDVWs) into the hypergraph minimum s-t cut problem. These weights are able to reflect different importance of vertices within a hyperedge, thus leading to better characterized cut properties. More precisely, we introduce a new class of hyperedge splitting functions that we call EDVWs-based, where the penalty of splitting a hyperedge depends only on the sum of EDVWs associated with the vertices on each side of the split. Moreover, we provide a way to construct submodular EDVWs-based splitting functions and prove that a hypergraph equipped with such splitting functions can be reduced to a graph sharing the same cut properties. In this case, the hypergraph minimum s-t cut problem can be solved using well-developed solutions to the graph minimum s-t cut problem. In addition, we show that an existing sparsification technique can be easily extended to our case and makes the reduced graph smaller and sparser, thus further accelerating the algorithms applied to the reduced graph. Numerical experiments using real-world data demonstrate the effectiveness of our proposed EDVWs-based splitting functions in comparison with the all-or-nothing splitting function and cardinality-based splitting functions commonly adopted in existing work.
  2. Summary

    This paper presents a method for determining the relevant buses for reduced models of power grid networks described by systems of differential‐algebraic equations and for constructing the coarse‐grain dynamical power grid systems. To determine these buses, time integration of differential equations is not needed, but rather, a stationary system is analyzed. However, unlike stationary‐system approaches that determine only coarse generator buses by approximating the coherency of the generators, the proposed method analyzes the graph Laplacian associated with the admittance matrix. The buses for the reduced model are chosen to ensure that the graph Laplacian of the reduced model is an accurate approximation to the graph Laplacian of the full system. Both load and generator buses can be selected by this procedure since the Laplacian is defined on all the buses. The basis of this proposed approach lies in the close relationship between the synchrony of the system and the spectral properties of this Laplacian, that is, conditions on the spectrum of this Laplacian that almost surely guarantee the synchrony of the system. Thus, assuming that the full system is in synchrony, our strategy is to coarsen the full‐system Laplacian such that the coarse Laplacian possesses good approximation to these spectral conditions. Accurate approximation to these conditions then can better lead to synchronous reduced models. The coarsened Laplacian is defined on coarse degrees of freedom (DOFs), which are associated with the relevant buses to include in the reduced model. To realize this coarse DOF selection, we use multigrid coarsening techniques based on compatible relaxation. Multigrid is the natural choice since it has been extensively used to coarsen Laplacians arising from discretizations of elliptic partial differential equations and is actively being extended to graph Laplacians. 
With the selection of the buses for the reduced model, the reduced model is completed by constructing the coarse admittance matrix values and other physical parameters using standard power grid techniques or by using the intergrid operators constructed in the coarse DOFs selection process. Unfortunately, the selection of the coarse buses and the coarsening of the admittance matrix and physical parameters are not sufficient by themselves to produce a stable reduced system. To achieve a stable system, system structures of the fine‐grain model must be preserved in the reduced model. We analyze this to develop a multigrid methodology for constructing stable reduced models of power grid systems. Numerical examples are presented to validate this methodology.

  3. One of the most intriguing conjectures in extremal graph theory is the conjecture of Erdős and Sós from 1962, which asserts that every $n$-vertex graph with more than $\frac{k-1}{2}n$ edges contains any $k$-edge tree as a subgraph. Kalai proposed a generalization of this conjecture to hypergraphs. To explain the generalization, we need to define the concept of a tight tree in an $r$-uniform hypergraph, i.e., a hypergraph where each edge contains $r$ vertices. A tight tree is an $r$-uniform hypergraph such that there is an ordering $v_1,\ldots,v_n$ of its vertices with the following property: the vertices $v_1,\ldots,v_r$ form an edge and for every $i>r$, there is a single edge $e$ containing the vertex $v_i$ and $r-1$ of the vertices $v_1,\ldots,v_{i-1}$, and $e\setminus\{v_i\}$ is a subset of one of the edges consisting only of vertices from $v_1,\ldots,v_{i-1}$. The conjecture of Kalai asserts that every $n$-vertex $r$-uniform hypergraph with more than $\frac{k-1}{r}\binom{n}{r-1}$ edges contains every $k$-edge tight tree as a subhypergraph. The recent breakthrough results on the existence of combinatorial designs by Keevash and by Glock, Kühn, Lo and Osthus show that this conjecture, if true, would be tight for infinitely many values of $n$ for every $r$ and $k$. The article deals with the special case of the conjecture when the sought tight tree is a path, i.e., the edges are the $r$-tuples of consecutive vertices in the above ordering. The case $r=2$ is the famous Erdős-Gallai theorem on the existence of paths in graphs. The case $r=3$ and $k=4$ follows from an earlier work of the authors on the conjecture of Kalai. The main result of the article is the first non-trivial upper bound valid for all $r$ and $k$.
The proof is based on techniques developed for a closely related problem where a hypergraph comes with a geometric structure: the vertices are points in the plane in a strictly convex position and the sought path has to zigzag between the vertices.
  4. We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a k-uniform hypergraph H with n vertices and m hyperedges, each hyperedge being a k-sized subset of vertices. A k-simplex in H is a subhypergraph on k+1 vertices X such that all k+1 possible hyperedges among X exist in H. The goal is to process the hyperedges of H, which arrive in an arbitrary order as a data stream, and compute a good estimate of T_k(H), the number of k-simplices in H. We design a suite of algorithms for this problem. As with triangle-counting in graphs (which is the special case k = 2), sublinear space is achievable but only under a promise of the form T_k(H) ≥ T. Under such a promise, our algorithms use at most four passes and together imply a space bound of O(ε^{-2} log δ^{-1} polylog n ⋅ min{(m^{1+1/k})/T, m/(T^{2/(k+1)})}) for each fixed k ≥ 3, in order to guarantee an estimate within (1±ε)T_k(H) with probability ≥ 1-δ. We also give a simpler 1-pass algorithm that achieves O(ε^{-2} log δ^{-1} log n⋅ (m/T) (Δ_E + Δ_V^{1-1/k})) space, where Δ_E (respectively, Δ_V) denotes the maximum number of k-simplices that share a hyperedge (respectively, a vertex), which generalizes a previous result for the k = 2 case. We complement these algorithmic results with space lower bounds of the form Ω(ε^{-2}), Ω(m^{1+1/k}/T), Ω(m/T^{1-1/k}) and Ω(mΔ_V^{1/k}/T) for multi-pass algorithms and Ω(mΔ_E/T) for 1-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs. 
  5. Abstract The $p$-tensor Ising model is a one-parameter discrete exponential family for modeling dependent binary data, where the sufficient statistic is a multi-linear form of degree $p \geqslant 2$. This is a natural generalization of the matrix Ising model that provides a convenient mathematical framework for capturing, not just pairwise, but higher-order dependencies in complex relational data. In this paper, we consider the problem of estimating the natural parameter of the $p$-tensor Ising model given a single sample from the distribution on $N$ nodes. Our estimate is based on the maximum pseudolikelihood (MPL) method, which provides a computationally efficient algorithm for estimating the parameter that avoids computing the intractable partition function. We derive general conditions under which the MPL estimate is $\sqrt N$-consistent, that is, it converges to the true parameter at rate $1/\sqrt N$. Our conditions are robust enough to handle a variety of commonly used tensor Ising models, including spin glass models with random interactions and models where the rate of estimation undergoes a phase transition. In particular, this includes results on $\sqrt N$-consistency of the MPL estimate in the well-known $p$-spin Sherrington–Kirkpatrick model, spin systems on general $p$-uniform hypergraphs and Ising models on the hypergraph stochastic block model (HSBM). In fact, for the HSBM we pin down the exact location of the phase transition threshold, which is determined by the positivity of a certain mean-field variational problem, such that above this threshold the MPL estimate is $\sqrt N$-consistent, whereas below the threshold no estimator is consistent. Finally, we derive the precise fluctuations of the MPL estimate in the special case of the $p$-tensor Curie–Weiss model, which is the Ising model on the complete $p$-uniform hypergraph. 
An interesting consequence of our results is that the MPL estimate in the Curie–Weiss model saturates the Cramér–Rao lower bound at all points above the estimation threshold, that is, the MPL estimate incurs no loss in asymptotic statistical efficiency in the estimability regime, even though it is obtained by minimizing only an approximation of the true likelihood function for computational tractability.
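The k-simplex definition used in item 4 above can be made concrete with a small exact counter. This is a brute-force sketch for checking the definition on tiny inputs, not the streaming algorithms that abstract describes; the function name `count_simplices` is a hypothetical choice made here.

```python
from itertools import combinations


def count_simplices(hyperedges, k):
    """Exactly count k-simplices in a k-uniform hypergraph.

    A k-simplex is a set X of k+1 vertices such that every k-sized
    subset of X is a hyperedge. Enumerates all (k+1)-subsets of the
    vertex set, so it is only suitable for very small hypergraphs.
    """
    edges = {frozenset(e) for e in hyperedges}
    if any(len(e) != k for e in edges):
        raise ValueError("hypergraph is not k-uniform")
    vertices = list(set().union(*edges))
    count = 0
    for candidate in combinations(vertices, k + 1):
        # All k+1 possible hyperedges among the candidate must be present.
        if all(frozenset(sub) in edges for sub in combinations(candidate, k)):
            count += 1
    return count


# k = 2 is ordinary triangle counting in a graph.
triangle = [(1, 2), (2, 3), (1, 3)]
print(count_simplices(triangle, 2))

# k = 3: the four triples on {1, 2, 3, 4} form one 3-simplex;
# the extra hyperedge (1, 2, 5) does not complete another.
tetra = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4), (1, 2, 5)]
print(count_simplices(tetra, 3))
```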