skip to main content


Title: Self-supervised Hypergraph Representation Learning
Despite the prevalence of hypergraphs in a variety of high-impact applications, there are relatively few works on hypergraph representation learning, most of which primarily focus on hyperlink prediction, and are often restricted to the transductive learning setting. Among others, a major hurdle for effective hypergraph representation learning lies in the label scarcity of nodes and/or hyperedges. To address this issue, this paper presents an end-to-end, bi-level pre-training strategy with Graph Neural Networks for hypergraphs. The proposed framework named HyperGRL bears three distinctive advantages. First, it is mainly designed in the self-supervised fashion which has broad applicability, and meanwhile it is also capable of ingesting the labeling information when available. Second, at the heart of the proposed HyperGRL are two carefully designed pretexts, one on the node level and the other on the hyperedge level, which enable us to encode both the local and the global context in a mutually complementary way. Third, the proposed framework can work in both transductive and inductive settings. When applying the two proposed pretexts in tandem, it can accelerate the adaptation of the knowledge from the pre-trained model to downstream applications in the transductive setting, thanks to the bi-level nature of the proposed method. Extensive experiments demonstrate that: (1) HyperGRL achieves up to 5.69% improvements in hyperedge classification, and (2) improves pre-training efficiency by up to 42.80% on average  more » « less
Award ID(s):
2134079 1939725 1947135
NSF-PAR ID:
10428929
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
2022 IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
505 to 514
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Computing a dense subgraph is a fundamental problem in graph mining, with a diverse set of applications ranging from electronic commerce to community detection in social networks. In many of these applications, the underlying context is better modelled as a weighted hypergraph that keeps evolving with time. This motivates the problem of maintaining the densest subhypergraph of a weighted hypergraph in a dynamic setting, where the input keeps changing via a sequence of updates (hyperedge insertions/deletions). Previously, the only known algorithm for this problem was due to Hu et al. [19]. This algorithm worked only on unweighted hypergraphs, and had an approximation ratio of (1 +ϵ)r2 and an update time of O(poly(r, log n)), where r denotes the maximum rank of the input across all the updates. We obtain a new algorithm for this problem, which works even when the input hypergraph is weighted. Our algorithm has a significantly improved (near-optimal) approximation ratio of (1 +ϵ) that is independent of r, and a similar update time of O(poly(r, log n)). It is the first (1 +ϵ)-approximation algorithm even for the special case of weighted simple graphs. To complement our theoretical analysis, we perform experiments with our dynamic algorithm on large-scale, real-world data-sets. Our algorithm significantly outperforms the state of the art [19] both in terms of accuracy and efficiency. 
    more » « less
  2. We study p -Laplacians and spectral clustering for a recently proposed hypergraph model that incorporates edge-dependent vertex weights (EDVW). These weights can reflect different importance of vertices within a hyperedge, thus conferring the hypergraph model higher expressivity and flexibility. By constructing submodular EDVW-based splitting functions, we convert hypergraphs with EDVW into submodular hypergraphs for which the spectral theory is better developed. In this way, existing concepts and theorems such as p -Laplacians and Cheeger inequalities proposed under the submodular hypergraph setting can be directly extended to hypergraphs with EDVW. For submodular hypergraphs with EDVW-based splitting functions, we propose an efficient algorithm to compute the eigenvector associated with the second smallest eigenvalue of the hypergraph 1-Laplacian. We then utilize this eigenvector to cluster the vertices, achieving higher clustering accuracy than traditional spectral clustering based on the 2-Laplacian. More broadly, the proposed algorithm works for all submodular hypergraphs that are graph reducible. Numerical experiments using real-world data demonstrate the effectiveness of combining spectral clustering based on the 1-Laplacian and EDVW. 
    more » « less
  3. null (Ed.)
    Web tracking and advertising (WTA) nowadays are ubiquitously performed on the web, continuously compromising users' privacy. Existing defense solutions, such as widely deployed blocking tools based on filter lists and alternative machine learning based solutions proposed in prior research, have limitations in terms of accuracy and effectiveness. In this work, we propose WtaGraph, a web tracking and advertising detection framework based on Graph Neural Networks (GNNs). We first construct an attributed homogenous multi-graph (AHMG) that represents HTTP network traffic, and formulate web tracking and advertising detection as a task of GNN-based edge representation learning and classification in AHMG. We then design four components in WtaGraph so that it can (1) collect HTTP network traffic, DOM, and JavaScript data, (2) construct AHMG and extract corresponding edge and node features, (3) build a GNN model for edge representation learning and WTA detection in the transductive learning setting, and (4) use a pre-trained GNN model for WTA detection in the inductive learning setting. We evaluate WtaGraph on a dataset collected from Alexa Top 10K websites, and show that WtaGraph can effectively detect WTA requests in both transductive and inductive learning settings. Manual verification results indicate that WtaGraph can detect new WTA requests that are missed by filter lists and recognize non-WTA requests that are mistakenly labeled by filter lists. Our ablation analysis, evasion evaluation, and real-time evaluation show that WtaGraph can have a competitive performance with flexible deployment options in practice. 
    more » « less
  4. We introduce finite-function-encoding (FFE) states which encode arbitrary d -valued logic functions, i.e., multivariate functions over the ring of integers modulo d , and investigate some of their structural properties. We also point out some differences between polynomial and non-polynomial function encoding states: The former can be associated to graphical objects, that we dub tensor-edge hypergraphs (TEH), which are a generalization of hypergraphs with a tensor attached to each hyperedge encoding the coefficients of the different monomials. To complete the framework, we also introduce a notion of finite-function-encoding Pauli (FP) operators, which correspond to elements of what is known as the generalized symmetric group in mathematics. First, using this machinery, we study the stabilizer group associated to FFE states and observe how qudit hypergraph states introduced in Ref. \cite{2017PhRvA..95e2340S} admit stabilizers of a particularly simpler form. Afterwards, we investigate the classification of FFE states under local unitaries (LU), and, after showing the complexity of this problem, we focus on the case of bipartite states and especially on the classification under local FP operations (LFP). We find all LU and LFP classes for two qutrits and two ququarts and study several other special classes, pointing out the relation between maximally entangled FFE states and complex Butson-type Hadamard matrices. Our investigation showcases also the relation between the properties of FFE states, especially their LU classification, and the theory of finite rings over the integers. 
    more » « less
  5. Koyejo S. ; Mohamed S. ; Agarwal A. ; Belgrave D. ; Cho K. ; Oh A. (Ed.)
    This paper targets at improving the generalizability of hypergraph neural networks in the low-label regime, through applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL). We focus on the following question: How to construct contrastive views for hypergraphs via augmentations? We provide the solutions in two folds. First, guided by domain knowledge, we fabricate two schemes to augment hyperedges with higher-order relations encoded, and adopt three vertex augmentation strategies from graph-structured data. Second, in search of more effective views in a data-driven manner, we for the first time propose a hypergraph generative model to generate augmented views, and then an end-to-end differentiable pipeline to jointly learn hypergraph augmentations and model parameters. Our technical innovations are reflected in designing both fabricated and generative augmentations of hypergraphs. The experimental findings include: (i) Among fabricated augmentations in HyperGCL, augmenting hyperedges provides the most numerical gains, implying that higher-order information in structures is usually more downstream-relevant; (ii) Generative augmentations do better in preserving higher-order information to further benefit generalizability; (iii) HyperGCL also boosts robustness and fairness in hypergraph representation learning. Codes are released at https://github.com/weitianxin/HyperGCL. 
    more » « less