

Title: Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative
This paper targets improving the generalizability of hypergraph neural networks in the low-label regime by applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL). We focus on the following question: how to construct contrastive views for hypergraphs via augmentations? Our solutions are twofold. First, guided by domain knowledge, we fabricate two schemes to augment hyperedges with higher-order relations encoded, and adopt three vertex augmentation strategies from graph-structured data. Second, in search of more effective views in a data-driven manner, we for the first time propose a hypergraph generative model to generate augmented views, and then an end-to-end differentiable pipeline to jointly learn hypergraph augmentations and model parameters. Our technical innovations lie in designing both fabricated and generative augmentations of hypergraphs. The experimental findings include: (i) among fabricated augmentations in HyperGCL, augmenting hyperedges provides the most numerical gains, implying that higher-order information in structures is usually more downstream-relevant; (ii) generative augmentations do better in preserving higher-order information, further benefiting generalizability; (iii) HyperGCL also boosts robustness and fairness in hypergraph representation learning. Code is released at https://github.com/weitianxin/HyperGCL.
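A minimal, hypothetical sketch of the fabricated-augmentation idea (this is not the released HyperGCL implementation; the function names, the dense incidence-matrix representation, and all hyperparameters are placeholders): one augmentation drops whole hyperedges to form a second view, and an InfoNCE-style objective contrasts node embeddings across the two views.

```python
# Hypothetical sketch: hyperedge-dropping augmentation + InfoNCE contrastive loss.
# Not the released HyperGCL code; shapes and hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def drop_hyperedges(incidence, drop_prob=0.3):
    """Randomly remove columns (hyperedges) of a dense |V| x |E| incidence matrix."""
    keep = torch.rand(incidence.shape[1]) > drop_prob
    return incidence[:, keep]

def info_nce(z1, z2, temperature=0.5):
    """Contrast node embeddings from two augmented views; positives lie on the diagonal."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```

In a full pipeline, two independently augmented views would be encoded by a shared hypergraph neural network and this loss minimized alongside (or before) the downstream objective; the paper's generative augmentations would replace the random drop with views produced by a learned generator.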
Award ID(s):
1943008
NSF-PAR ID:
10419606
Author(s) / Creator(s):
; ; ; ; ;
Editor(s):
Koyejo S.; Mohamed S.; Agarwal A.; Belgrave D.; Cho K.; Oh A.
Date Published:
Journal Name:
Advances in Neural Information Processing Systems
Volume:
35
ISSN:
1049-5258
Page Range / eLocation ID:
1909-1922
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the effect of structured higher-order interactions on the collective behavior of coupled phase oscillators. By combining a hypergraph generative model with dimensionality reduction techniques, we obtain a reduced system of differential equations for the system’s order parameters. We illustrate our framework with the example of a hypergraph with hyperedges of sizes 2 (links) and 3 (triangles). For this case, we obtain a set of two coupled nonlinear algebraic equations for the order parameters. For strong values of coupling via triangles, the system exhibits bistability and explosive synchronization transitions. We find conditions that lead to bistability in terms of hypergraph properties and validate our predictions with numerical simulations. Our results provide a general framework to study the synchronization of phase oscillators in hypergraphs, and they can be extended to hypergraphs with hyperedges of arbitrary sizes, dynamic-structural correlations, and other features. 
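For concreteness, a standard form of phase-oscillator dynamics with pairwise (link) and triadic (triangle) coupling, together with the usual order parameter, is shown below; the specific generative model and dimensionality reduction used in this work may differ in details.

```latex
% Common higher-order Kuramoto dynamics: K_1, K_2 are link/triangle coupling
% strengths, \omega_i natural frequencies; R measures the degree of synchrony.
\begin{aligned}
\dot{\theta}_i &= \omega_i
  + \frac{K_1}{N}\sum_{j=1}^{N}\sin(\theta_j-\theta_i)
  + \frac{K_2}{N^2}\sum_{j=1}^{N}\sum_{k=1}^{N}\sin(2\theta_j-\theta_k-\theta_i), \\
R\,e^{\mathrm{i}\psi} &= \frac{1}{N}\sum_{j=1}^{N} e^{\mathrm{i}\theta_j}.
\end{aligned}
```

The reduced equations referred to in the abstract are closed algebraic relations for the order parameters in the large-N limit.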
  2. Despite the prevalence of hypergraphs in a variety of high-impact applications, there are relatively few works on hypergraph representation learning, most of which focus primarily on hyperlink prediction and are often restricted to the transductive learning setting. Among others, a major hurdle for effective hypergraph representation learning lies in the label scarcity of nodes and/or hyperedges. To address this issue, this paper presents an end-to-end, bi-level pre-training strategy with Graph Neural Networks for hypergraphs. The proposed framework, named HyperGRL, bears three distinctive advantages. First, it is designed mainly in a self-supervised fashion, which gives it broad applicability, while it is also capable of ingesting label information when available. Second, at the heart of the proposed HyperGRL are two carefully designed pretexts, one at the node level and the other at the hyperedge level, which enable us to encode both the local and the global context in a mutually complementary way. Third, the proposed framework can work in both transductive and inductive settings. When the two proposed pretexts are applied in tandem, the framework can accelerate the adaptation of knowledge from the pre-trained model to downstream applications in the transductive setting, thanks to the bi-level nature of the proposed method. Extensive experiments demonstrate that HyperGRL: (1) achieves up to 5.69% improvement in hyperedge classification, and (2) improves pre-training efficiency by up to 42.80% on average.
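A heavily simplified skeleton of how such a bi-level objective might be combined during pre-training (the actual node- and hyperedge-level pretext tasks of HyperGRL are not reproduced here; `node_pretext_loss`, `hyperedge_pretext_loss`, and the weighting are placeholders):

```python
# Illustrative skeleton: combine a node-level and a hyperedge-level pretext loss
# into one pre-training step with a shared encoder. Placeholders throughout.
import torch

def pretrain_step(encoder, hypergraph, node_pretext_loss, hyperedge_pretext_loss,
                  optimizer, alpha=0.5):
    node_emb, hyperedge_emb = encoder(hypergraph)   # shared (hyper)graph neural network
    loss = (alpha * node_pretext_loss(node_emb, hypergraph)
            + (1 - alpha) * hyperedge_pretext_loss(hyperedge_emb, hypergraph))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```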
  3. We consider the problem of space-efficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highly-studied problem of estimating the number of triangles in a graph stream. Our input is a k-uniform hypergraph H with n vertices and m hyperedges, each hyperedge being a k-sized subset of vertices. A k-simplex in H is a subhypergraph on k+1 vertices X such that all k+1 possible hyperedges among X exist in H. The goal is to process the hyperedges of H, which arrive in an arbitrary order as a data stream, and compute a good estimate of T_k(H), the number of k-simplices in H. We design a suite of algorithms for this problem. As with triangle-counting in graphs (which is the special case k = 2), sublinear space is achievable but only under a promise of the form T_k(H) ≥ T. Under such a promise, our algorithms use at most four passes and together imply a space bound of O(ε^{-2} log δ^{-1} polylog n ⋅ min{(m^{1+1/k})/T, m/(T^{2/(k+1)})}) for each fixed k ≥ 3, in order to guarantee an estimate within (1±ε)T_k(H) with probability ≥ 1-δ. We also give a simpler 1-pass algorithm that achieves O(ε^{-2} log δ^{-1} log n⋅ (m/T) (Δ_E + Δ_V^{1-1/k})) space, where Δ_E (respectively, Δ_V) denotes the maximum number of k-simplices that share a hyperedge (respectively, a vertex), which generalizes a previous result for the k = 2 case. We complement these algorithmic results with space lower bounds of the form Ω(ε^{-2}), Ω(m^{1+1/k}/T), Ω(m/T^{1-1/k}) and Ω(mΔ_V^{1/k}/T) for multi-pass algorithms and Ω(mΔ_E/T) for 1-pass algorithms, which show that some of the dependencies on parameters in our upper bounds are nearly tight. Our techniques extend and generalize several different ideas previously developed for triangle counting in graphs, using appropriate innovations to handle the more complicated combinatorics of hypergraphs. 
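To make the estimated quantity concrete, the brute-force definition of T_k(H) can be computed directly on small inputs (the streaming algorithms above approximate it in sublinear space and are not reproduced here):

```python
# Exact (offline) count of k-simplices in a k-uniform hypergraph; brute force,
# only for small instances. A k-simplex is a set of k+1 vertices all of whose
# k-sized subsets are hyperedges.
from itertools import combinations

def count_k_simplices(hyperedges, k):
    edge_set = {frozenset(e) for e in hyperedges}
    vertices = sorted(set().union(*edge_set)) if edge_set else []
    count = 0
    for X in combinations(vertices, k + 1):
        if all(frozenset(s) in edge_set for s in combinations(X, k)):
            count += 1
    return count

# For k = 2 this is ordinary triangle counting in a graph.
print(count_k_simplices([(1, 2), (1, 3), (2, 3), (2, 4)], 2))  # -> 1
```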
  4. Hypergraphs provide the formalism needed to solve problems consisting of interconnected item sets. A hypergraph is similar to a traditional graph, with the added generalization that "hyperedges" may connect any number of nodes. Domains such as very-large-scale integration for creating integrated circuits [1], machine learning [2], [3], [4], parallel algorithms [5], combinatorial scientific computing [6], and social network analysis [7], [8] all contain significant and challenging instances of hypergraph problems. One important problem, hypergraph partitioning, involves dividing the nodes of a hypergraph among k similarly sized disjoint sets while reducing the number of hyperedges that span multiple partitions. In the context of load balancing, this is the problem of dividing logical threads (nodes) that share data dependencies (hyperedges) among available machines (partitions) in order to balance the number of threads per machine and minimize communication overhead. However, hypergraph partitioning is NP-hard both to solve [9] and to approximate [10].
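As a small illustration of the objective just described (not any particular partitioner), given an assignment of nodes to k parts one can count the hyperedges that span more than one part; balance of the part sizes would be checked separately:

```python
# Hyperedge cut of a given partition: number of hyperedges whose vertices fall
# into more than one part. Variable names are illustrative.
def hyperedge_cut(hyperedges, part_of):
    """hyperedges: iterable of vertex collections; part_of: dict vertex -> part id."""
    return sum(1 for edge in hyperedges if len({part_of[v] for v in edge}) > 1)

# Example: nodes 1-4 split into two parts; only the edge (2, 3) crosses the cut.
parts = {1: 0, 2: 0, 3: 1, 4: 1}
print(hyperedge_cut([(1, 2), (2, 3), (3, 4)], parts))  # -> 1
```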
  5. We consider hypergraph visualizations that represent vertices as points and hyperedges as lines with few bends passing through the points of their incident vertices. Guided by point-line incidence theory, we show several theoretical results: if every vertex is part of at most two hyperedges, then such a visualization can be found without bends. There exist hypergraphs with three vertices per hyperedge and three hyperedges incident to each vertex that require an arbitrarily large number of bends. It is ETR-hard to decide whether an arbitrary hypergraph can be visualized without bends. This answers only some of the interesting questions for such visualizations, and we conclude with many open research questions.
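The first result above gives a sufficient (not necessary) condition for a bend-free drawing, which is easy to check on a given hypergraph; a tiny illustrative helper (names are hypothetical):

```python
# Check the sufficient condition from the first result: every vertex belongs to
# at most two hyperedges. If True, a bend-free point/line visualization exists;
# if False, one may or may not exist.
from collections import Counter

def vertex_degree_at_most_two(hyperedges):
    degree = Counter(v for edge in hyperedges for v in edge)
    return all(d <= 2 for d in degree.values())

print(vertex_degree_at_most_two([(1, 2, 3), (3, 4, 5), (5, 6, 1)]))  # -> True
```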