Title: Achieving Causal Fairness through Generative Adversarial Networks

Achieving fairness in learning models is currently an imperative task in machine learning. Meanwhile, recent research has shown that fairness should be studied from the causal perspective, and has proposed a number of fairness criteria based on Pearl's causal modeling framework. In this paper, we investigate the problem of building causal fairness-aware generative adversarial networks (CFGAN), which can learn a distribution close to that of a given dataset while also ensuring various causal fairness criteria based on a given causal graph. CFGAN adopts two generators, whose structures are purposefully designed to reflect the structures of the causal graph and the interventional graph. The two generators can therefore simulate, respectively, the underlying causal model that generates the real data and the causal model after the intervention. In addition, two discriminators are used to produce a close-to-real distribution and to achieve various fairness criteria based on the causal quantities simulated by the generators. Experiments on a real-world dataset show that CFGAN can generate high-quality fair data.
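
To make the distinction between the causal model and the causal model after the intervention concrete, the following is a minimal sketch in plain Python. It is not the CFGAN architecture: the three-variable graph (S -> X -> Y and S -> Y), the probabilities, and the function names `sample`, `positive_rate`, and `total_effect` are all hypothetical choices made for illustration. It samples from a toy causal model, then from the same model with the sensitive attribute S fixed by an intervention, and compares the two interventional outcome rates (a total-effect style quantity, assumed here as one example of a causal fairness measure).

```python
import random


def sample(n, do_s=None, seed=0):
    """Draw n rows (s, x, y) from a toy causal graph S -> X -> Y, S -> Y.

    If do_s is given, S is fixed to that value instead of being drawn,
    which mimics sampling from the interventional graph under do(S = do_s).
    All probabilities below are made-up illustration values.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s = do_s if do_s is not None else int(rng.random() < 0.5)
        x = int(rng.random() < (0.7 if s == 1 else 0.3))  # X depends on S
        p_y = 0.2 + 0.3 * s + 0.4 * x                     # Y depends on S and X
        y = int(rng.random() < p_y)
        rows.append((s, x, y))
    return rows


def positive_rate(rows):
    """Fraction of rows with outcome y == 1."""
    return sum(y for _, _, y in rows) / len(rows)


def total_effect(n=100_000):
    """Estimate P(Y=1 | do(S=1)) - P(Y=1 | do(S=0)) by simulation."""
    rate_s1 = positive_rate(sample(n, do_s=1, seed=1))
    rate_s0 = positive_rate(sample(n, do_s=0, seed=2))
    return rate_s1 - rate_s0


if __name__ == "__main__":
    print(f"estimated total effect: {total_effect():.3f}")
```

For these made-up parameters the analytic value is (0.2 + 0.3 + 0.4 * 0.7) - (0.2 + 0.4 * 0.3) = 0.46, so the simulated estimate should land close to that. A fairness-aware generator would aim to produce data for which such a quantity is near zero.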

Award ID(s):
1646654 1564250 1841119
Publication Date:
Journal Name:
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Page Range or eLocation-ID:
1452 to 1458
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent years have witnessed the pivotal role of Graph Neural Networks (GNNs) in various high-stakes decision-making scenarios due to their superior learning capability. Close on the heels of the successful adoption of GNNs in different application domains has been the increasing societal concern that conventional GNNs often do not have fairness considerations. Although some research progress has been made to improve the fairness of GNNs, these works mainly focus on the notion of group fairness regarding different subgroups defined by a protected attribute such as gender, age, and race. Beyond that, it is also essential to study GNN fairness at a much finer granularity (i.e., at the node level) to ensure that GNNs render similar prediction results for similar individuals and thereby achieve individual fairness. Toward this goal, in this paper, we make an initial investigation to enhance the individual fairness of GNNs and propose a novel ranking-based framework, REDRESS. Specifically, we refine the notion of individual fairness from a ranking perspective, and formulate the ranking-based individual fairness promotion problem. This naturally addresses the issues of Lipschitz constant specification and distance calibration that result from the Lipschitz condition in the conventional individual fairness definition. Our proposed framework REDRESS encapsulates the GNN model utility maximization and the ranking-based individual fairness promotion in a joint framework to enable end-to-end training. It is worth mentioning that REDRESS is a plug-and-play framework and can be easily generalized to any prevalent GNN architecture. Extensive experiments on multiple real-world graphs demonstrate the superiority of REDRESS in achieving a good balance between model utility maximization and individual fairness promotion. Our open source code can be found here:
  2. We study fairness in supervised few-shot meta-learning models that are sensitive to discrimination (or bias) in historical data. A machine learning model trained on biased data tends to make unfair predictions for users from minority groups. Although this problem has been studied before, existing methods mainly aim to detect and control the dependency effect of the protected variables (e.g. race, gender) on target prediction based on a large amount of training data. These approaches carry two major drawbacks: (1) they lack a global cause-effect visualization for all variables; (2) they lack generalization of both accuracy and fairness to unseen tasks. In this work, we first discover discrimination from data using a causal Bayesian knowledge graph, which not only demonstrates the dependency of the protected variable on the target but also indicates causal effects between all variables. Next, we develop a novel algorithm based on risk difference in order to quantify the discriminatory influence of each protected variable in the graph. Furthermore, to protect predictions from unfairness, a fast-adapted bias-control approach in meta-learning is proposed, which efficiently mitigates statistical disparity for each task and thus ensures independence of protected attributes on predictions based on biased and few-shot data samples. Distinct from existing meta-learning models, group unfairness of tasks is efficiently reduced by leveraging the mean difference between (un)protected groups for regression problems. Through extensive experiments on both synthetic and real-world data sets, we demonstrate that our proposed unfairness discovery and prevention approaches efficiently detect discrimination and mitigate biases on model output, as well as generalize both accuracy and fairness to unseen tasks with a limited amount of training samples.
  3. Predictive models learned from historical data are widely used to help companies and organizations make decisions. However, they may treat some groups unfairly, raising concerns about fairness and discrimination. In this paper, we study the fairness-aware ranking problem, which aims to discover discrimination in ranked datasets and reconstruct the fair ranking. Existing methods in fairness-aware ranking are mainly based on statistical parity, which cannot measure the true discriminatory effect since discrimination is causal. On the other hand, existing methods in causal-based anti-discrimination learning focus on classification problems and cannot be directly applied to ranked data. To address these limitations, we propose to map the rank position to a continuous score variable that represents the qualification of the candidates. Then, we build a causal graph that consists of both the discrete profile attributes and the continuous score. The path-specific effect technique is extended to the mixed-variable causal graph to identify both direct and indirect discrimination. The relationship between the path-specific effects for the ranked data and those for the binary decision is theoretically analyzed. Finally, algorithms for discovering and removing discrimination from a ranked dataset are developed. Experiments using a real-world dataset show the effectiveness of our approaches.
  4. Fair machine learning aims to mitigate the biases of model predictions against certain subpopulations regarding sensitive attributes such as race and gender. Among the many existing fairness notions, counterfactual fairness measures model fairness from a causal perspective by comparing the predictions for each individual from the original data and from the counterfactuals, in which the sensitive attribute values of this individual have been modified. Recently, a few works have extended counterfactual fairness to graph data, but most of them neglect the following facts that can lead to biases: 1) the sensitive attributes of each node's neighbors may causally affect the prediction w.r.t. this node; 2) the sensitive attributes may causally affect other features and the graph structure. To tackle these issues, in this paper, we propose a novel fairness notion, graph counterfactual fairness, which considers the biases induced by the above facts. To learn node representations towards graph counterfactual fairness, we propose a novel framework based on counterfactual data augmentation. In this framework, we generate counterfactuals corresponding to perturbations on each node's and its neighbors' sensitive attributes. Then we enforce fairness by minimizing the discrepancy between the representations learned from the original graph and the counterfactuals for each node. Experiments on both synthetic and real-world graphs show that our framework outperforms the state-of-the-art baselines in graph counterfactual fairness, and also achieves comparable prediction performance.
  5. Graph-based learning and estimation are fundamental problems in various applications involving power, social, and brain networks, to name a few. While learning pair-wise interactions in network data is a well-studied problem, discovering higher-order interactions among subsets of nodes is not yet fully explored. To this end, encompassing and leveraging (non)linear structural equation models as well as vector autoregressions, this paper proposes autoregressive graph Volterra models (AGVMs) that can capture not only the connectivity between nodes but also higher-order interactions present in the networked data. The proposed overarching model inherits the identifiability and expressibility of the Volterra series. Furthermore, two tailored algorithms based on the proposed AGVM are put forth for topology identification and link prediction in distribution grids and social networks, respectively. Real-data experiments on different real-world collaboration networks highlight the impact of higher-order interactions in our approach, yielding discernible differences relative to existing methods.