skip to main content


Title: Certification and Trade-off of Multiple Fairness Criteria in Graph-based Spam Detection
Spamming reviews are prevalent in review systems to manipulate seller reputation and mislead customers. Spam detectors based on graph neural networks (GNN) exploit representation learning and graph patterns to achieve state-of-the-art detection accuracy. The detection can influence a large number of real-world entities and it is ethical to treat different groups of entities as equally as possible. However, due to skewed distributions of the graphs, GNN can fail to meet diverse fairness criteria designed for different parties. We formulate linear systems of the input features and the adjacency matrix of the review graphs for the certification of multiple fairness criteria. When the criteria are competing, we relax the certification and design a multi-objective optimization (MOO) algorithm to explore multiple efficient trade-offs, so that no objective can be improved without harming another objective. We prove that the algorithm converges to a Pareto efficient solution using duality and the implicit function theorem. Since there can be exponentially many trade-offs of the criteria, we propose a data-driven stochastic search algorithm to approximate Pareto fronts consisting of multiple efficient trade-offs. Experimentally, we show that the algorithms converge to solutions that dominate baselines based on fairness regularization and adversarial training.  more » « less
Award ID(s):
1931042 2008155 1909879
NSF-PAR ID:
10288834
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 30th ACM International Conference on Information and Knowledge Management
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Spamming reviews are prevalent in review systems to manipulate seller reputation and mislead customers. Spam detectors based on graph neural networks (GNN) exploit representation learning and graph patterns to achieve state-of-the-art detection accuracy. The detection can influence a large number of real-world entities and it is ethical to treat different groups of entities as equally as possible. However, due to skewed distributions of the graphs, GNN can fail to meet diverse fairness criteria designed for different parties. We formulate linear systems of the input features and the adjacency matrix of the review graphs for the certification of multiple fairness criteria. When the criteria are competing, we relax the certification and design a multi-objective optimization (MOO) algorithm to explore multiple efficient trade-offs, so that no objective can be improved without harming another objective. We prove that the algorithm converges to a Pareto efficient solution using duality and the implicit function theorem. Since there can be exponentially many trade-offs of the criteria, we propose a data-driven stochastic search algorithm to approximate Pareto fronts consisting of multiple efficient trade-offs. Experimentally, we show that the algorithms converge to solutions that dominate baselines based on fairness regularization and adversarial training. 
    more » « less
  2. Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning, such as simultaneously minimizing classification and fairness losses. At an optimum, further optimizing one objective will necessarily increase at least another objective, and decision-makers need to comprehensively explore multiple optima to pin-point one final solution. We address the efficiency of exploring the Pareto front that contains all optima. First, stochastic multi-gradient descent (SMGD) takes time to converge to the Pareto front with large neural networks and datasets. Instead, we explore the Pareto front as a manifold from a few initial optima, based on a predictor-corrector method. Second, for each exploration step, the predictor iteratively solves a large-scale linear system that scales quadratically in the number of model parameters, and requires one backpropagation to evaluate a second-order Hessian-vector product per iteration of the solver. We propose a Gauss-Newton approximation that scales linearly, and that requires only first-order inner-product per iteration. T hird, we explore different linear system solvers, including the MINRES and conjugate gradient methods for approximately solving the linear systems. The innovations make predictor-corrector efficient for large networks and datasets. Experiments on a fair misinformation detection task show that 1) the predictor-corrector method can find Pareto fronts better than or similar to SMGD with less time, and 2) the proposed first-order method does not harm the quality of the Pareto front identified by the second-order method, while further reducing running time. 
    more » « less
  3. Graphs are widely found in social network analysis and e-commerce, where Graph Neural Networks (GNNs) are the state-of the-art model. GNNs can be biased due to sensitive attributes and network topology. With existing work that learns a fair node representation or adjacency matrix, achieving a strong guarantee of group fairness while preserving prediction accuracy is still challenging, with the fairness-accuracy trade-off remaining obscure to human decision-makers. We first define and analyze a novel upper bound of group fairness to optimize the adjacency matrix for fairness without significantly h arming prediction accuracy. To understand the nuance of fairness-accuracy tradeoff, we further propose macroscopic and microscopic explanation methods to reveal the trade-offs and the space that one can exploit. The macroscopic explanation method is based on stratified sampling and linear programming to deterministically explain the dynamics of the group fairness and prediction accuracy. Driving down to the microscopic level, we propose a path-based explanation that reveals how network topology leads to the tradeoff. On seven graph datasets, we demonstrate the novel upper bound can achieve more efficient fairness-accuracy trade-offs and the intuitiveness of the explanation methods can clearly pinpoint where the trade-off is improved. 
    more » « less
  4. null (Ed.)
    While real-world decisions involve many competing objectives, algorithmic decisions are often evaluated with a single objective function. In this paper, we study algorithmic policies which explicitly trade off between a private objective (such as profit) and a public objective (such as social welfare). We analyze a natural class of policies which trace an empirical Pareto frontier based on learned scores, and focus on how such decisions can be made in noisy or data-limited regimes. Our theoretical results characterize the optimal strategies in this class, bound the Pareto errors due to inaccuracies in the scores, and show an equivalence between optimal strategies and a rich class of fairness-constrained profit-maximizing policies. We then present empirical results in two different contexts — online content recommendation and sustainable abalone fisheries — to underscore the generality of our approach to a wide range of practical decisions. Taken together, these results shed light on inherent trade-offs in using machine learning for decisions that impact social welfare. 
    more » « less
  5. null (Ed.)
    While real-world decisions involve many competing objectives, algorithmic decisions are often evaluated with a single objective function. In this paper, we study algorithmic policies which explicitly trade off between a private objective (such as profit) and a public objective (such as social welfare). We analyze a natural class of policies which trace an empirical Pareto frontier based on learned scores, and focus on how such decisions can be made in noisy or data-limited regimes. Our theoretical results characterize the optimal strategies in this class, bound the Pareto errors due to inaccuracies in the scores, and show an equivalence between optimal strategies and a rich class of fairness-constrained profit-maximizing policies. We then present empirical results in two different contexts — online content recommendation and sustainable abalone fisheries — to underscore the generality of our approach to a wide range of practical decisions. Taken together, these results shed light on inherent trade-offs in using machine learning for decisions that impact social welfare. 
    more » « less