Search for: All records

Creators/Authors contains: "Wang, Song"

  1. Few-shot graph classification aims to assign class labels to graph samples when only a limited number of labeled graphs are provided for each class. To address this label scarcity, recent studies have adopted the prevalent few-shot learning framework to achieve fast adaptation to graph classes with limited labeled graphs. In particular, these studies typically accumulate meta-knowledge across a large number of meta-training tasks and then generalize that meta-knowledge to meta-test tasks sampled from a disjoint class set. Nevertheless, existing studies generally treat meta-training tasks independently and ignore the crucial correlations among them. Such task correlations can improve generalization to meta-test tasks and yield better classification performance. However, capturing and utilizing task correlations remains challenging because of the complex components and interactions in meta-training tasks. To deal with this, we propose FAITH, a novel few-shot graph classification framework that captures task correlations by learning a hierarchical task structure at different granularities. We further propose a task-specific classifier to incorporate the learned task correlations into the few-shot graph classification process. Moreover, we derive FAITH+, a variant of FAITH that can improve the sampling process for the hierarchical task structure. Extensive experiments on four prevalent graph datasets demonstrate the superiority of FAITH and FAITH+ over other state-of-the-art baselines.

    Free, publicly-accessible full text available April 30, 2025
  2. Graph-structured data is ubiquitous in real-world applications. However, as graph learning algorithms are increasingly deployed to support decision-making, societal concern about the bias these algorithms may exhibit has been rising. In certain high-stakes decision-making scenarios, the decisions made may be life-changing for the individuals involved. Accordingly, many efforts have been made in recent years to mitigate bias in graph learning algorithms. However, a library that consolidates existing debiasing techniques and helps practitioners easily perform bias mitigation for graph learning algorithms is still lacking. In this paper, we present PyGDebias, an open-source Python library for bias mitigation in graph learning algorithms. As the first comprehensive library of its kind, PyGDebias covers 13 popular debiasing methods under common fairness notions together with 26 commonly used graph datasets. In addition, PyGDebias comes with comprehensive performance benchmarks and well-documented API designs for both researchers and practitioners. To foster convenient accessibility, PyGDebias is released under a permissive BSD license together with performance benchmarks, API documentation, and usage examples at https://github.com/yushundong/PyGDebias.
    Free, publicly-accessible full text available May 13, 2025
  3. The task of few-shot graph classification aims to assign class labels to graph samples, where only a limited number of labeled graphs are provided for each class. To deal with the problem brought about by label scarcity, recent works have focused on adopting the prevalent few-shot learning framework to ensure fast adaptations to classes with limited labeled graphs. In general, these studies propose to accumulate meta-knowledge across various base classes with sufficient labeled graphs, and then generalize such meta-knowledge to novel classes, which are disjoint from base classes and consist of limited labeled graphs. However, existing studies generally ignore the distinct distribution shifts between base classes and novel classes, leading to unsatisfactory adaptation performance. On the other hand, it remains challenging to address this issue due to the potential variance in distributions between classes. To tackle this problem, we propose a novel generative few-shot graph classification framework that can promote adaptation performance by generating adaptive structures for graphs in novel classes. Our framework incorporates a generative model to modify the graph structures for adaptation. We further conduct extensive experiments to validate the effectiveness of our framework. 
    Free, publicly-accessible full text available October 29, 2024
  4. Recently, there has been growing interest in developing machine learning (ML) models that promote fairness, i.e., that eliminate biased predictions towards certain populations (e.g., individuals from a specific demographic group). Most existing works learn such models based on well-designed fairness constraints in optimization. Nevertheless, in many practical ML tasks, only very few labeled data samples can be collected, which can lead to inferior fairness performance. This is because existing fairness constraints are designed to restrict the prediction disparity among different sensitive groups, but with few samples it becomes difficult to measure that disparity accurately, which renders fairness optimization ineffective. In this paper, we define the fairness-aware learning task with limited training samples as the fair few-shot learning problem. To deal with this problem, we devise a novel framework that accumulates fairness-aware knowledge across different meta-training tasks and then generalizes the learned knowledge to meta-test tasks. To compensate for insufficient training samples, we propose a strategy to select and leverage an auxiliary set for each meta-test task. These auxiliary sets contain several labeled training samples that can enhance fairness performance in meta-test tasks, thereby allowing useful fairness-oriented knowledge to be transferred to meta-test tasks. Furthermore, we conduct extensive experiments on three real-world datasets to validate the superiority of our framework over state-of-the-art baselines.
    Free, publicly-accessible full text available September 30, 2024
  5. Federated Learning (FL) enables multiple clients to collaboratively learn a machine learning model without exchanging their own local data. In this way, the server can exploit the computational power of all clients and train the model on a larger set of data samples across all clients. Although this mechanism has proven effective in various fields, existing works generally assume that each client holds sufficient data for training. In practice, however, certain clients may hold only a limited number of samples (i.e., few-shot samples). For example, a specific user with a new mobile device will have taken relatively few photos. In this scenario, existing FL methods typically suffer a significant performance drop on these clients. Therefore, it is urgent to develop a few-shot model that can generalize to clients with limited data under the FL scenario. In this paper, we refer to this novel problem as federated few-shot learning. The problem remains challenging for two major reasons: the global data variance among clients (i.e., the difference in data distributions among clients) and the local data insufficiency in each client (i.e., the lack of adequate local data for training). To overcome these two challenges, we propose a novel federated few-shot learning framework with two separately updated models and dedicated training strategies to reduce the adverse impact of global data variance and local data insufficiency. Extensive experiments on four prevalent datasets that cover news articles and images validate the effectiveness of our framework compared with state-of-the-art baselines.
    Free, publicly-accessible full text available August 4, 2024
  6. Graph Neural Networks (GNNs) have emerged as the leading paradigm for solving graph analytical problems in various real-world applications. Nevertheless, GNNs could potentially render biased predictions towards certain demographic subgroups. Understanding how the bias in predictions arises is critical, as it guides the design of GNN debiasing mechanisms. However, most existing works focus overwhelmingly on GNN debiasing and fall short in explaining how such bias is induced. In this paper, we study a novel problem of interpreting GNN unfairness by attributing it to the influence of training nodes. Specifically, we propose a novel strategy named Probabilistic Distribution Disparity (PDD) to measure the bias exhibited in GNNs, and develop an algorithm to efficiently estimate the influence of each training node on such bias. We verify the validity of PDD and the effectiveness of influence estimation through experiments on real-world datasets. Finally, we demonstrate how the proposed framework could be used for debiasing GNNs. Open-source code can be found at https://github.com/yushundong/BIND.
    Free, publicly-accessible full text available June 27, 2024
  7. Few-shot node classification aims to classify nodes using only a limited number of labeled nodes as references. Recent few-shot node classification methods typically learn from classes with abundant labeled nodes (i.e., meta-training classes) and then generalize to classes with limited labeled nodes (i.e., meta-test classes). Nevertheless, on real-world graphs, it is usually difficult to obtain abundant labeled nodes for many classes. In practice, each meta-training class may contain only a few labeled nodes, a setting known as the extremely weak supervision problem. In few-shot node classification, with extremely limited labeled nodes for meta-training, the generalization gap between meta-training and meta-test becomes larger and thus leads to suboptimal performance. To tackle this issue, we study a novel problem of few-shot node classification with extremely weak supervision and propose a principled framework, X-FNC, under the prevalent meta-learning framework. Specifically, our goal is to accumulate meta-knowledge across different meta-training tasks with extremely weak supervision and generalize such knowledge to meta-test tasks. To address the challenges resulting from extremely scarce labeled nodes, we propose two essential modules to obtain pseudo-labeled nodes as extra references and to learn effectively from extremely limited supervision information. We further conduct extensive experiments on four node classification datasets with extremely weak supervision to validate the superiority of our framework over state-of-the-art baselines.
  8. Adopting a two-stage paradigm of pretraining followed by fine-tuning, Pretrained Language Models (PLMs) have achieved substantial advancements in the field of natural language processing. However, in real-world scenarios, data labels are often noisy due to the complex annotation process, making it essential to develop strategies for fine-tuning PLMs with such noisy labels. To this end, we introduce an innovative approach for fine-tuning PLMs using noisy labels, which incorporates the guidance of Large Language Models (LLMs) like ChatGPT. This guidance assists in accurately distinguishing between clean and noisy samples and provides supplementary information beyond the noisy labels, thereby boosting the learning process during PLM fine-tuning. Extensive experiments on synthetic and real-world noisy datasets demonstrate the advantages of our framework over state-of-the-art baselines.
  9. Graph mining algorithms have played a significant role in myriad fields over the years. However, despite their promising performance on various graph analytical tasks, most of these algorithms lack fairness considerations. As a consequence, they could lead to discrimination towards certain populations when used in human-centered applications. Recently, algorithmic fairness has been extensively studied in graph-based applications. In contrast to algorithmic fairness on independent and identically distributed (i.i.d.) data, fairness in graph mining has its own background, taxonomies, and techniques for achieving it. In this survey, we provide a comprehensive and up-to-date introduction to the existing literature on fair graph mining. Specifically, we propose a novel taxonomy of fairness notions on graphs, which sheds light on their connections and differences. We further present an organized summary of existing techniques that promote fairness in graph mining. Finally, we discuss current research challenges and open questions, aiming to encourage cross-fertilization of ideas and further advances.
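
Two ideas recur across the abstracts above: meta-training and meta-test tasks built from a few labeled samples per class, and fairness constraints that restrict prediction disparity between sensitive groups. The sketch below illustrates both in generic form. It is not taken from any of the listed papers: the function names `sample_episode` and `statistical_parity_gap`, the N-way K-shot episode layout, and the choice of statistical parity as the disparity measure are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot task (hypothetical helper).

    Returns (support, query), each a list of (sample_index, class) pairs.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Only classes with enough samples can fill both the support and query sets.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


def statistical_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1.

    Assumes binary predictions and that both groups are non-empty.
    """
    rates = []
    for g in (0, 1):
        members = [p for p, s in zip(preds, groups) if s == g]
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


# Toy data (assumed): 5 classes with 10 samples each.
labels = [i % 5 for i in range(50)]
support, query = sample_episode(labels, n_way=3, k_shot=2, n_query=4,
                                rng=random.Random(0))
```

With only a handful of samples per group, each rate inside `statistical_parity_gap` is an average over very few predictions, which is one way to see why disparity is hard to measure accurately in the few-shot setting.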