skip to main content


Title: InFoRM: Individual Fairness on Graph Mining
Algorithmic bias and fairness in the context of graph mining have largely remained nascent. The sparse literature on fair graph mining has almost exclusively focused on group-based fairness notation. However, the notion of individual fairness, which promises the fairness notion at a much finer granularity, has not been well studied. This paper presents the first principled study of Individual Fairness on gRaph Mining (InFoRM). First, we present a generic definition of individual fairness for graph mining which naturally leads to a quantitative measure of the potential bias in graph mining results. Second, we propose three mutually complementary algorithmic frameworks to mitigate the proposed individual bias measure, namely debiasing the input graph, debiasing the mining model and debiasing the mining results. Each algorithmic framework is formulated from the optimization perspective, using effective and efficient solvers, which are applicable to multiple graph mining tasks. Third, accommodating individual fairness is likely to change the original graph mining results without the fairness consideration. We conduct a thorough analysis to develop an upper bound to characterize the cost (i.e., the difference between the graph mining results with and without the fairness consideration). We perform extensive experimental evaluations on real-world datasets to demonstrate the efficacy and generality of the proposed methods.  more » « less
Award ID(s):
1939725 1715385 1947135
NSF-PAR ID:
10200242
Author(s) / Creator(s):
Date Published:
Journal Name:
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Graph mining is an essential component of recommender systems and search engines. Outputs of graph mining models typically provide a ranked list sorted by each item's relevance or utility. However, recent research has identified issues of algorithmic bias in such models, and new graph mining algorithms have been proposed to correct for bias. As such, algorithm developers need tools that can help them uncover potential biases in their models while also exploring the impacts of correcting for biases when employing fairness-aware algorithms. In this paper, we present FairRankVis, a visual analytics framework designed to enable the exploration of multi-class bias in graph mining algorithms. We support both group and individual fairness levels of comparison. Our framework is designed to enable model developers to compare multi-class fairness between algorithms (for example, comparing PageRank with a debiased PageRank algorithm) to assess the impacts of algorithmic debiasing with respect to group and individual fairness. We demonstrate our framework through two usage scenarios inspecting algorithmic fairness. 
    more » « less
  2. Algorithmic fairness is becoming increasingly important in data mining and machine learning. Among others, a foundational notation is group fairness. The vast majority of the existing works on group fairness, with a few exceptions, primarily focus on debiasing with respect to a single sensitive attribute, despite the fact that the co-existence of multiple sensitive attributes (e.g., gender, race, marital status, etc.) in the real-world is commonplace. As such, methods that can ensure a fair learning outcome with respect to all sensitive attributes of concern simultaneously need to be developed. In this paper, we study the problem of information-theoretic intersectional fairness (InfoFair), where statistical parity, a representative group fairness measure, is guaranteed among demographic groups formed by multiple sensitive attributes of interest. We formulate it as a mutual information minimization problem and propose a generic end-to-end algorithmic framework to solve it. The key idea is to leverage a variational representation of mutual information, which considers the variational distribution between learning outcomes and sensitive attributes, as well as the density ratio between the variational and the original distributions. Our proposed framework is generalizable to many different settings, including other statistical notions of fairness, and could handle any type of learning task equipped with a gradientbased optimizer. Empirical evaluations in the fair classification task on three real-world datasets demonstrate that our proposed framework can effectively debias the classification results with minimal impact to the classification accuracy. 
    more » « less
  3. In a connected world, fair graph learning is becoming increasingly important because of the growing concerns about bias. Yet, the vast majority of existing works assume that the input graph comes from a single view while ignoring the multi-view essence of graphs. Generally speaking, the bias in graph mining is often rooted in the input graph and is further introduced or even amplified by the graph mining model. It thus poses critical research questions regarding the intrinsic relationships of fairness on different views and the possibility of mitigating bias on multiple views simultaneously. To answer these questions, in this paper, we explore individual fairness in multi-view graph mining. We first demonstrate the necessity of fair multi-view graph learning. Building upon the optimization perspective of fair single-view graph mining, we then formulate our problem as a linear weighted optimization problem. In order to figure out the weight of each view, we resort to the minimax Pareto fairness, which is closely related to the Rawlsian difference principle, and propose an effective solver named iFiG that minimizes the utility loss while promoting individual fairness for each view with two different instantiations. The extensive experiments that we conduct in the application of multi-view spectral clustering and INFORM post-processing demonstrate the efficacy of our proposed method in individual bias mitigation. 
    more » « less
  4. Graph is a ubiquitous type of data that appears in many real-world applications, including social network analysis, recommendations and financial security. Important as it is, decades of research have developed plentiful computational models to mine graphs. Despite its prosperity, concerns with respect to the potential algorithmic discrimination have been grown recently. Algorithmic fairness on graphs, which aims to mitigate bias introduced or amplified during the graph mining process, is an attractive yet challenging research topic. The first challenge corresponds to the theoretical challenge, where the non-IID nature of graph data may not only invalidate the basic assumption behind many existing studies in fair machine learning, but also introduce new fairness definition(s) based on the inter-correlation between nodes rather than the existing fairness definition(s) in fair machine learning. The second challenge regarding its algorithmic aspect aims to understand how to balance the trade-off between model accuracy and fairness. This tutorial aims to (1) comprehensively review the state-of-the-art techniques to enforce algorithmic fairness on graphs and (2) enlighten the open challenges and future directions. We believe this tutorial could benefit researchers and practitioners from the areas of data mining, artificial intelligence and social science. 
    more » « less
  5. Social media have become a prevalent channel to access information, spread ideas, and influence opinions. However, it has been suggested that social and algorithmic filtering may cause exposure to less diverse points of view. Here we quantitatively measure this kind of social bias at the collective level by mining a massive datasets of web clicks. Our analysis shows that collectively, people access information from a significantly narrower spectrum of sources through social media and email, compared to a search baseline. The significance of this finding for individual exposure is revealed by investigating the relationship between the diversity of information sources experienced by users at both the collective and individual levels in two datasets where individual users can be analyzed—Twitter posts and search logs. There is a strong correlation between collective and individual diversity, supporting the notion that when we use social media we find ourselves inside “social bubbles.” Our results could lead to a deeper understanding of how technology biases our exposure to new information.

     
    more » « less