skip to main content

Title: Active Sampling over Graphs for Bayesian Reconstruction with Gaussian Ensembles
Graph-guided semi-supervised learning (SSL) has gained popularity in several network science applications, including biological, social, and financial ones. SSL becomes particularly challenging when the available nodal labels are scarce, what motivates naturally the active learning (AL) paradigm. AL seeks the most informative nodes to label in order to effectively estimate the nodal values of unobserved nodes. It is also referred to as active sampling, and boils down to learning the sought function mapping, and an acquisition function (AF) to identify the next node(s) to sample. To learn the mapping, this work leverages an adaptive Bayesian model comprising an ensemble (E) of Gaussian Processes (GPs) with enhanced expressiveness of the function space. Unlike most alternatives, the EGP model relies only on the one-hop connectivity of each node. Capitalizing on this EGP model, a suite of novel and intuitive AFs are developed to guide the active sampling process. These AFs are then combined with weights that are adapted incrementally to further robustify performance. Numerical tests on real and synthetic datasets corroborate the merits of the novel methods.  more » « less
Award ID(s):
2128593 2103256 1901134 2212318
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Asilomar Conference on Signals Systems and Computers
Page Range / eLocation ID:
58 to 64
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Graph-guided semi-supervised learning (SSL) and inference has emerged as an attractive research field thanks to its documented impact in a gamut of application domains, including transportation and power networks, biological, social, environmental, and financial ones. Distinct from SSL approaches that yield point estimates of the variables to be inferred, the present work puts forth a Bayesian interval learning framework that utilizes Gaussian processes (GPs) to allow for uncertainty quantification – a key component in safety-critical applications. An ensemble (E) of GPs is employed to offer an expressive model of the learning function that is updated incrementally as nodal observations become available – what caters also for delay-sensitive settings. For the first time in graph-guided SSL and inference, egonet features per node are utilized as input to the EGP learning function to account for higher order interactions than the one-hop connectivity of each node. Further enhancing these attributes through random features that encrypt sensitive information per node offers scalability and privacy for the EGP-based learning approach. Numerical tests on real and synthetic datasets corroborate the effectiveness of the novel method. 
    more » « less
  2. Graph-guided learning has well-documented impact in a gamut of network science applications. A prototypical graph-guided learning task deals with semi-supervised learning over graphs, where the goal is to predict the nodal values or labels of unobserved nodes, by leveraging a few nodal observations along with the underlying graph structure. This is particularly challenging under privacy constraints or generally when acquiring nodal observations incurs high cost. In this context, the present work puts forth a Bayesian graph-driven self-supervised learning (Self-SL) approach that: (i) learns powerful nodal embeddings emanating from easier to solve auxiliary tasks that map local to global connectivity information; and, (ii) adopts an ensemble of Gaussian processes (EGPs) with adaptive weights as nodal embeddings are processed online. Unlike most existing deterministic approaches, the novel approach offers accurate estimates of the unobserved nodal values along with uncertainty quantification that is important especially in safety critical applications. Numerical tests on synthetic and real graph datasets showcase merits of the novel EGP-based Self-SL method. 
    more » « less
  3. Bayesian optimization (BO) has well-documented merits for optimizing black-box functions with an expensive evaluation cost. Such functions emerge in applications as diverse as hyperparameter tuning, drug discovery, and robotics. BO hinges on a Bayesian surrogate model to sequentially select query points so as to balance exploration with exploitation of the search space. Most existing works rely on a single Gaussian process (GP) based surrogate model, where the kernel function form is typically preselected using domain knowledge. To bypass such a design process, this paper leverages an ensemble (E) of GPs to adaptively select the surrogate model fit on-the-fly, yielding a GP mixture posterior with enhanced expressiveness for the sought function. Acquisition of the next evaluation input using this EGP-based function posterior is then enabled by Thompson sampling (TS) that requires no additional design parameters. To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model. The novel EGP-TS readily accommodates parallel operation. To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret for both sequential and parallel settings. Tests on synthetic functions and real-world applications showcase the merits of the proposed method. 
    more » « less
  4. Most state-of-the-art Graph Neural Networks (GNNs) can be defined as a form of graph convolution which can be realized by message passing between direct neighbors or beyond. To scale such GNNs to large graphs, various neighbor-, layer-, or subgraph-sampling techniques are proposed to alleviate the "neighbor explosion" problem by considering only a small subset of messages passed to the nodes in a mini-batch. However, sampling-based methods are difficult to apply to GNNs that utilize many-hops-away or global context each layer, show unstable performance for different tasks and datasets, and do not speed up model inference. We propose a principled and fundamentally different approach, VQ-GNN, a universal framework to scale up any convolution-based GNNs using Vector Quantization (VQ) without compromising the performance. In contrast to sampling-based techniques, our approach can effectively preserve all the messages passed to a mini-batch of nodes by learning and updating a small number of quantized reference vectors of global node representations, using VQ within each GNN layer. Our framework avoids the "neighbor explosion" problem of GNNs using quantized representations combined with a low-rank version of the graph convolution matrix. We show that such a compact low-rank version of the gigantic convolution matrix is sufficient both theoretically and experimentally. In company with VQ, we design a novel approximated message passing algorithm and a nontrivial back-propagation rule for our framework. Experiments on various types of GNN backbones demonstrate the scalability and competitive performance of our framework on large-graph node classification and link prediction benchmarks. 
    more » « less
  5. Graphs have emerged as one of the most important and powerful data structures to perform content analysis in many fields. In this line of work, node classification is a classic task, which is generally performed using graph neural networks (GNNs). Unfortunately, regular GNNs cannot be well generalized into the real-world application scenario when the labeled nodes are few. To address this challenge, we propose a novel few-shot node classification model that leverages pseudo-labeling with graph active learning. We first provide a theoretical analysis to argue that extra unlabeled data benefit few-shot classification. Inspired by this, our model proceeds by performing multi-level data augmentation with consistency and contrastive regularizations for better semi-supervised pseudo-labeling, and further devising graph active learning to facilitate pseudo-label selection and improve model effectiveness. Extensive experiments on four public citation networks have demonstrated that our model can effectively improve node classification accuracy with considerably few labeled data, which significantly outperforms all state-of-the-art baselines by large margins. 
    more » « less