skip to main content

Title: FairRankVis: A Visual Analytics Framework for Exploring Algorithmic Fairness in Graph Mining Models
Graph mining is an essential component of recommender systems and search engines. Outputs of graph mining models typically provide a ranked list sorted by each item's relevance or utility. However, recent research has identified issues of algorithmic bias in such models, and new graph mining algorithms have been proposed to correct for bias. As such, algorithm developers need tools that can help them uncover potential biases in their models while also exploring the impacts of correcting for biases when employing fairness-aware algorithms. In this paper, we present FairRankVis, a visual analytics framework designed to enable the exploration of multi-class bias in graph mining algorithms. We support both group and individual fairness levels of comparison. Our framework is designed to enable model developers to compare multi-class fairness between algorithms (for example, comparing PageRank with a debiased PageRank algorithm) to assess the impacts of algorithmic debiasing with respect to group and individual fairness. We demonstrate our framework through two usage scenarios inspecting algorithmic fairness.
Authors:
; ; ; ;
Award ID(s):
1939725
Publication Date:
NSF-PAR ID:
10298775
Journal Name:
IEEE Transactions on Visualization and Computer Graphics
Page Range or eLocation-ID:
1 to 1
ISSN:
1077-2626
Sponsoring Org:
National Science Foundation
More Like this
  1. Algorithmic bias and fairness in the context of graph mining have largely remained nascent. The sparse literature on fair graph mining has almost exclusively focused on group-based fairness notation. However, the notion of individual fairness, which promises the fairness notion at a much finer granularity, has not been well studied. This paper presents the first principled study of Individual Fairness on gRaph Mining (InFoRM). First, we present a generic definition of individual fairness for graph mining which naturally leads to a quantitative measure of the potential bias in graph mining results. Second, we propose three mutually complementary algorithmic frameworks to mitigate the proposed individual bias measure, namely debiasing the input graph, debiasing the mining model and debiasing the mining results. Each algorithmic framework is formulated from the optimization perspective, using effective and efficient solvers, which are applicable to multiple graph mining tasks. Third, accommodating individual fairness is likely to change the original graph mining results without the fairness consideration. We conduct a thorough analysis to develop an upper bound to characterize the cost (i.e., the difference between the graph mining results with and without the fairness consideration). We perform extensive experimental evaluations on real-world datasets to demonstrate the efficacy andmore »generality of the proposed methods.« less
  2. The increasing impact of algorithmic decisions on people’s lives compels us to scrutinize their fairness and, in particular, the disparate impacts that ostensibly color-blind algorithms can have on different groups. Examples include credit decisioning, hiring, advertising, criminal justice, personalized medicine, and targeted policy making, where in some cases legislative or regulatory frameworks for fairness exist and define specific protected classes. In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data. This is particularly a problem in lending and healthcare. We consider the use of an auxiliary data set, such as the U.S. census, to construct models that predict the protected class from proxy variables, such as surname and geolocation. We show that even with such data, a variety of common disparity measures are generally unidentifiable, providing a new perspective on the documented biases of popular proxy-based methods. We provide exact characterizations of the tightest possible set of all possible true disparities that are consistent with the data (and possibly additional assumptions). We further provide optimization-based algorithms for computing and visualizing these sets and statistical tools to assess sampling uncertainty. Together, these enable reliable and robust assessmentsmore »of disparities—an important tool when disparity assessment can have far-reaching policy implications. We demonstrate this in two case studies with real data: mortgage lending and personalized medicine dosing. This paper was accepted by Hamid Nazerzadeh, Guest Editor for the Special Issue on Data-Driven Prescriptive Analytics.« less
  3. In today’s increasingly connected world, graph mining plays a piv- otal role in many real-world application domains, including social network analysis, recommendations, marketing and financial secu- rity. Tremendous efforts have been made to develop a wide range of computational models. However, recent studies have revealed that many widely-applied graph mining models could suffer from potential discrimination. Fairness on graph mining aims to develop strategies in order to mitigate bias introduced/amplified during the mining process. The unique challenges of enforcing fairness on graph mining include (1) theoretical challenge on non-IID nature of graph data, which may invalidate the basic assumption behind many existing studies in fair machine learning, and (2) algorith- mic challenge on the dilemma of balancing model accuracy and fairness. This tutorial aims to (1) present a comprehensive review of state-of-the-art techniques in fairness on graph mining and (2) identify the open challenges and future trends. In particular, we start with reviewing the background, problem definitions, unique challenges and related problems; then we will focus on an in-depth overview of (1) recent techniques in enforcing group fairness, indi- vidual fairness and other fairness notions in the context of graph mining, and (2) future directions in studying algorithmic fairness onmore »graphs. We believe this tutorial could be attractive to researchers and practitioners in areas including data mining, artificial intel- ligence, social science and beneficial to a plethora of real-world application domains.« less
  4. The idealization of a static machine-learned model, trained once and deployed forever, is not practical. As input distributions change over time, the model will not only lose accuracy, any constraints to reduce bias against a protected class may fail to work as intended. Thus, researchers have begun to explore ways to maintain algorithmic fairness over time. One line of work focuses on dynamic learning: retraining after each batch, and the other on robust learning which tries to make algorithms robust against all possible future changes. Dynamic learning seeks to reduce biases soon after they have occurred and robust learning often yields (overly) conservative models. We propose an anticipatory dynamic learning approach for correcting the algorithm to mitigate bias before it occurs. Specifically, we make use of anticipations regarding the relative distributions of population subgroups (e.g., relative ratios of male and female applicants) in the next cycle to identify the right parameters for an importance weighing fairness approach. Results from experiments over multiple real-world datasets suggest that this approach has promise for anticipatory bias correction.
  5. As algorithms increasingly inform and influence decisions made about individuals, it becomes increasingly important to address concerns that these algorithms might be discriminatory. The output of an algorithm can be discriminatory for many reasons, most notably: (1) the data used to train the algorithm might be biased (in various ways) to favor certain populations over others; (2) the analysis of this training data might inadvertently or maliciously introduce biases that are not borne out in the data. This work focuses on the latter concern. We develop and study multicalbration -- a new measure of algorithmic fairness that aims to mitigate concerns about discrimination that is introduced in the process of learning a predictor from data. Multicalibration guarantees accurate (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. We think of the class as being quite rich; in particular, it can contain many overlapping subgroups of a protected group. We show that in many settings this strong notion of protection from discrimination is both attainable and aligned with the goal of obtaining accurate predictions. Along the way, we present new algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and drawmore »new connections to computational learning models such as agnostic learning.« less