skip to main content

This content will become publicly available on December 17, 2023

Title: iFiG: Individually Fair Multi-view Graph Clustering
In a connected world, fair graph learning is becoming increasingly important because of the growing concerns about bias. Yet, the vast majority of existing works assume that the input graph comes from a single view while ignoring the multi-view essence of graphs. Generally speaking, the bias in graph mining is often rooted in the input graph and is further introduced or even amplified by the graph mining model. It thus poses critical research questions regarding the intrinsic relationships of fairness on different views and the possibility of mitigating bias on multiple views simultaneously. To answer these questions, in this paper, we explore individual fairness in multi-view graph mining. We first demonstrate the necessity of fair multi-view graph learning. Building upon the optimization perspective of fair single-view graph mining, we then formulate our problem as a linear weighted optimization problem. In order to figure out the weight of each view, we resort to the minimax Pareto fairness, which is closely related to the Rawlsian difference principle, and propose an effective solver named iFiG that minimizes the utility loss while promoting individual fairness for each view with two different instantiations. The extensive experiments that we conduct in the application of multi-view spectral clustering and INFORM post-processing demonstrate the efficacy of our proposed method in individual bias mitigation.  more » « less
Award ID(s):
2134079 1939725 1947135
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
2022 IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
329 to 338
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Algorithmic bias and fairness in the context of graph mining have largely remained nascent. The sparse literature on fair graph mining has almost exclusively focused on group-based fairness notation. However, the notion of individual fairness, which promises the fairness notion at a much finer granularity, has not been well studied. This paper presents the first principled study of Individual Fairness on gRaph Mining (InFoRM). First, we present a generic definition of individual fairness for graph mining which naturally leads to a quantitative measure of the potential bias in graph mining results. Second, we propose three mutually complementary algorithmic frameworks to mitigate the proposed individual bias measure, namely debiasing the input graph, debiasing the mining model and debiasing the mining results. Each algorithmic framework is formulated from the optimization perspective, using effective and efficient solvers, which are applicable to multiple graph mining tasks. Third, accommodating individual fairness is likely to change the original graph mining results without the fairness consideration. We conduct a thorough analysis to develop an upper bound to characterize the cost (i.e., the difference between the graph mining results with and without the fairness consideration). We perform extensive experimental evaluations on real-world datasets to demonstrate the efficacy and generality of the proposed methods. 
    more » « less
  2. Graph Neural Networks (GNNs) have shown great power in learning node representations on graphs. However, they may inherit historical prejudices from training data, leading to discriminatory bias in predictions. Although some work has developed fair GNNs, most of them directly borrow fair representation learning techniques from non-graph domains without considering the potential problem of sensitive attribute leakage caused by feature propagation in GNNs. However, we empirically observe that feature propagation could vary the correlation of previously innocuous non-sensitive features to the sensitive ones. This can be viewed as a leakage of sensitive information which could further exacerbate discrimination in predictions. Thus, we design two feature masking strategies according to feature correlations to highlight the importance of considering feature propagation and correlation variation in alleviating discrimination. Motivated by our analysis, we propose Fair View Graph Neural Network (FairVGNN) to generate fair views of features by automatically identifying and masking sensitive-correlated features considering correlation variation after feature propagation. Given the learned fair views, we adaptively clamp weights of the encoder to avoid using sensitive-related features. Experiments on real-world datasets demonstrate that FairVGNN enjoys a better trade-off between model utility and fairness. 
    more » « less
  3. As machine learning becomes more widely adopted across domains, it is critical that researchers and ML engineers think about the inherent biases in the data that may be perpetuated by the model. Recently, many studies have shown that such biases are also imbibed in Graph Neural Network (GNN) models if the input graph is biased, potentially to the disadvantage of underserved and underrepresented communities. In this work, we aim to mitigate the bias learned by GNNs by jointly optimizing two different loss functions: one for the task of link prediction and one for the task of demographic parity. We further implement three different techniques inspired by graph modification approaches: the Global Fairness Optimization (GFO), Constrained Fairness Optimization (CFO), and Fair Edge Weighting (FEW) models. These techniques mimic the effects of changing underlying graph structures within the GNN and offer a greater degree of interpretability over more integrated neural network methods. Our proposed models emulate microscopic or macroscopic edits to the input graph while training GNNs and learn node embeddings that are both accurate and fair under the context of link recommendations. We demonstrate the effectiveness of our approach on four real world datasets and show that we can improve the recommendation fairness by several factors at negligible cost to link prediction accuracy. 
    more » « less
  4. null (Ed.)
    Graph mining is an essential component of recommender systems and search engines. Outputs of graph mining models typically provide a ranked list sorted by each item's relevance or utility. However, recent research has identified issues of algorithmic bias in such models, and new graph mining algorithms have been proposed to correct for bias. As such, algorithm developers need tools that can help them uncover potential biases in their models while also exploring the impacts of correcting for biases when employing fairness-aware algorithms. In this paper, we present FairRankVis, a visual analytics framework designed to enable the exploration of multi-class bias in graph mining algorithms. We support both group and individual fairness levels of comparison. Our framework is designed to enable model developers to compare multi-class fairness between algorithms (for example, comparing PageRank with a debiased PageRank algorithm) to assess the impacts of algorithmic debiasing with respect to group and individual fairness. We demonstrate our framework through two usage scenarios inspecting algorithmic fairness. 
    more » « less
  5. In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that enjoys strong empirical performance and inherits the mathematical properties of increasingly popular convex clustering methods. Specifically, our Integrative Generalized Convex Clustering Optimization (iGecco) method employs different convex distances, losses, or divergences for each of the different data views with a joint convex fusion penalty that leads to common groups. Additionally, integrating mixed multi-view data is often challenging when each data source is high-dimensional. To perform feature selection in such scenarios, we develop an adaptive shifted group-lasso penalty that selects features by shrinking them towards their loss-specific centers. Our so-called iGecco+ approach selects features from each data view that are best for determining the groups, often leading to improved integrative clustering. To solve our problem, we develop a new type of generalized multi-block ADMM algorithm using sub-problem approximations that more efficiently fits our model for big data sets. Through a series of numerical experiments and real data examples on text mining and genomics, we show that iGecco+ achieves superior empirical performance for high-dimensional mixed multi-view data. 
    more » « less