User-generated product reviews are essential for online platforms like Amazon and Yelp, but the presence of fake reviews misleads customers. Graph neural networks (GNNs) are the state-of-the-art method for detecting suspicious reviewers, exploiting the topology of the graph connecting reviewers, reviews, and products. Nevertheless, discrepancies in detection accuracy across different groups of reviewers degrade reviewer engagement and customer trust in review websites. Unlike the previous belief that differences between the groups cause unfairness, we study subgroup structures within the groups that can also cause discrepancies in how the groups are treated. This paper addresses the challenges of defining, approximating, and utilizing a new subgroup structure for fair spam detection. We first identify subgroup structures in the review graph that lead to discrepant accuracy across the groups. The complex dependencies over the review graph make it difficult to tease out subgroups hidden within larger groups. We design a model that can be trained to jointly infer the hidden subgroup memberships and exploit the memberships to calibrate detection accuracy across groups. Comprehensive comparisons against baselines on three large Yelp review datasets demonstrate that the subgroup memberships can be identified and exploited for group fairness.
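The accuracy discrepancy the abstract calibrates against can be made concrete as a per-group accuracy gap. A minimal sketch, using synthetic labels and hypothetical group ids (not data or code from the paper):

```python
import numpy as np

def group_accuracy_gap(y_true, y_pred, groups):
    """Per-group detection accuracy and the max pairwise gap.

    y_true, y_pred: 0/1 arrays of spam labels and predictions.
    groups: array of group ids (e.g., reviewer groups) per example.
    """
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = float((y_true[mask] == y_pred[mask]).mean())
    gap = max(accs.values()) - min(accs.values())
    return accs, gap

# toy example with two reviewer groups
y_true = np.array([1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0, 1, 0])
groups = np.array([0, 0, 0, 1, 1, 1])
accs, gap = group_accuracy_gap(y_true, y_pred, groups)
# group 0 is detected at 2/3 accuracy, group 1 at 3/3, so gap = 1/3
```

A fairness-calibrated detector would drive `gap` toward zero; the paper's contribution is to do so while accounting for hidden subgroup memberships inside each group.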
Robust Spammer Detection by Nash Reinforcement Learning
Online reviews provide product evaluations for customers to make decisions. Unfortunately, the evaluations can be manipulated using fake reviews ("spams") by professional spammers, who have learned increasingly insidious and powerful spamming strategies by adapting to the deployed detectors. Spamming strategies are hard to capture, as they can vary quickly over time, differ across spammers and target products, and, more critically, remain unknown in most cases. Furthermore, most existing detectors focus on detection accuracy, which is not well aligned with the goal of maintaining the trustworthiness of product evaluations. To address these challenges, we formulate a minimax game where the spammers and spam detectors compete with each other on their practical goals, which are not solely based on detection accuracy. Nash equilibria of the game lead to stable detectors that are agnostic to any mixed detection strategies. However, the game has no closed-form solution and is not differentiable, so it does not admit the typical gradient-based algorithms. We turn the game into two dependent Markov Decision Processes (MDPs) to allow efficient stochastic optimization based on multi-armed bandits and policy gradient. We experiment on three large review datasets using various state-of-the-art spamming and detection strategies and show that the optimization algorithm can reliably find an equilibrial detector that can robustly and effectively prevent spammers with any mixed spamming strategies from attaining their practical goal. Our code is available at https://github.com/YingtongDou/Nash-Detect.
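The minimax optimization the abstract describes — a spammer choosing among spamming strategies via a multi-armed bandit while the detector is updated by policy gradient — can be illustrated with a toy self-play loop. The payoff matrix and update rule below are synthetic stand-ins for illustration, not the paper's actual MDP formulation or Nash-Detect code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical payoff matrix: practical gain of spamming strategy i when
# detector configuration j is deployed (synthetic numbers for illustration).
payoff = np.array([[0.8, 0.2, 0.5],
                   [0.3, 0.7, 0.4],
                   [0.5, 0.4, 0.6]])
n_spam, n_det = payoff.shape

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

spam_logits = np.zeros(n_spam)  # spammer's mixed strategy over spamming arms
det_logits = np.zeros(n_det)    # detector's mixed strategy over configurations
lr = 0.1

for _ in range(5000):
    p_s, p_d = softmax(spam_logits), softmax(det_logits)
    i = rng.choice(n_spam, p=p_s)  # bandit pull: sample a spamming strategy
    j = rng.choice(n_det, p=p_d)   # sample a detector configuration
    r = payoff[i, j]               # spammer's realized practical gain
    # REINFORCE-style updates: spammer ascends on r, detector on (1 - r)
    spam_logits[i] += lr * r * (1.0 - p_s[i])
    det_logits[j] += lr * (1.0 - r) * (1.0 - p_d[j])
```

The point of the sketch is the structure of the game: both sides maintain mixed strategies and update them from sampled payoffs, rather than from a closed-form or differentiable objective.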
- Award ID(s): 1931042
- PAR ID: 10184259
- Date Published:
- Journal Name: The 26th ACM SIGKDD international conference on knowledge discovery and data mining (KDD'2020)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In multi-agent dynamic games, the Nash equilibrium state trajectory of each agent is determined by its cost function and the information pattern of the game. However, the cost and trajectory of each agent may be unavailable to the other agents. Prior work on using partial observations to infer the costs in dynamic games assumes an open-loop information pattern. In this work, we demonstrate that the feedback Nash equilibrium concept is more expressive and encodes more complex behavior. It is desirable to develop specific tools for inferring players' objectives in feedback games. Therefore, we consider the dynamic game cost inference problem under the feedback information pattern, using only partial state observations and incomplete trajectory data. To this end, we first propose an inverse feedback game loss function, whose minimizer yields a feedback Nash equilibrium state trajectory closest to the observation data. We characterize the landscape and differentiability of the loss function. Given the difficulty of obtaining the exact gradient, our main contribution is an efficient gradient approximator, which enables a novel inverse feedback game solver that minimizes the loss using first-order optimization. In thorough empirical evaluations, we demonstrate that our algorithm converges reliably and has better robustness and generalization performance than the open-loop baseline method when the observation data reflects a group of players acting in a feedback Nash game.
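The overall shape of the solver — approximate the gradient of an inverse-game loss, then minimize by first-order optimization — can be sketched generically. The finite-difference approximator and the stand-in "equilibrium solver" below are hypothetical simplifications, not the paper's specialized gradient approximator:

```python
import numpy as np

def approx_grad(loss, theta, eps=1e-5):
    """Central finite-difference gradient approximation, coordinate by
    coordinate. A generic stand-in for a specialized gradient approximator."""
    g = np.zeros_like(theta)
    for k in range(theta.size):
        e = np.zeros_like(theta)
        e[k] = eps
        g[k] = (loss(theta + e) - loss(theta - e)) / (2.0 * eps)
    return g

# Hypothetical inverse-game loss: squared distance between the equilibrium
# trajectory induced by cost parameters theta and the observed data y_obs.
def traj(theta):
    # stand-in for a feedback Nash equilibrium solver (linear toy map)
    return np.array([theta[0] + 2.0 * theta[1], theta[0] - theta[1]])

y_obs = np.array([3.0, 0.0])
loss = lambda th: 0.5 * np.sum((traj(th) - y_obs) ** 2)

theta = np.zeros(2)
for _ in range(200):                       # first-order minimization
    theta -= 0.1 * approx_grad(loss, theta)
# theta converges to (1, 1), the cost parameters that explain y_obs
```

In the paper's setting `traj` would be an actual feedback Nash equilibrium computation, which is where the exact gradient becomes hard and an efficient approximator pays off.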
-
Automated market makers (AMMs) are decentralized exchanges that enable the automated trading of digital assets. Liquidity providers (LPs) deposit digital tokens, which can be used by traders to execute trades, which generate fees for the investing LPs. In AMMs, trade prices are determined algorithmically, unlike classical limit order books. Concentrated liquidity market makers (CLMMs) are a major class of AMMs that offer liquidity providers flexibility to decide not only how much liquidity to provide, but in what ranges of prices they want the liquidity to be used. This flexibility can complicate strategic planning, since fee rewards are shared among LPs. We formulate and analyze a game-theoretic model to study the incentives of LPs in CLMMs. Our main results show that while our original formulation admits multiple Nash equilibria and has complexity quadratic in the number of price ticks in the contract, it can be reduced to a game with a unique Nash equilibrium whose complexity is only linear. We further show that the Nash equilibrium of this simplified game follows a waterfilling strategy, in which low-budget LPs use up their full budget, but rich LPs do not. Finally, by fitting our game model to real-world CLMMs, we observe that in liquidity pools with risky assets, LPs adopt investment strategies far from the Nash equilibrium. Under price uncertainty, they generally invest in fewer and wider price ranges than our analysis suggests, with lower-frequency liquidity updates. In such risky pools, by updating their strategy to more closely match the Nash equilibrium of our game, LPs can improve their median daily returns by $116, which corresponds to an increase of 0.009% in median daily return on investment (ROI). At maximum, LPs can improve daily ROI by 0.855% when they reach Nash equilibrium. In contrast, in stable pools (e.g., of only stablecoins), LPs already adopt strategies that more closely resemble the Nash equilibrium of our game.
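The waterfilling shape of the equilibrium — low-budget LPs exhaust their budgets while rich LPs cap at a common level — can be illustrated with a toy allocation. The total liquidity target and budgets below are hypothetical numbers, not fitted CLMM data:

```python
def waterfill(budgets, total):
    """Find a common level lam so that sum(min(b, lam)) == total.

    LPs with budget b <= lam invest their full budget; richer LPs cap
    their investment at lam. Solved by bisection on the level, since
    sum(min(b, lam)) is monotone increasing in lam.
    """
    lo, hi = 0.0, max(budgets)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if sum(min(b, mid) for b in budgets) < total:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2.0
    return [min(b, lam) for b in budgets], lam

# three LPs with budgets 1, 2, and 10 filling a total of 7 units:
# the two poor LPs invest everything; the rich LP stops at the level 4
alloc, lam = waterfill([1.0, 2.0, 10.0], total=7.0)
```

This matches the abstract's qualitative claim: the budget constraint binds only for the low-budget LPs, while the rich LP's investment is determined by the shared level rather than its budget.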
-
Spamming reviews are prevalent in review systems to manipulate seller reputation and mislead customers. Spam detectors based on graph neural networks (GNN) exploit representation learning and graph patterns to achieve state-of-the-art detection accuracy. The detection can influence a large number of real-world entities and it is ethical to treat different groups of entities as equally as possible. However, due to skewed distributions of the graphs, GNN can fail to meet diverse fairness criteria designed for different parties. We formulate linear systems of the input features and the adjacency matrix of the review graphs for the certification of multiple fairness criteria. When the criteria are competing, we relax the certification and design a multi-objective optimization (MOO) algorithm to explore multiple efficient trade-offs, so that no objective can be improved without harming another objective. We prove that the algorithm converges to a Pareto efficient solution using duality and the implicit function theorem. Since there can be exponentially many trade-offs of the criteria, we propose a data-driven stochastic search algorithm to approximate Pareto fronts consisting of multiple efficient trade-offs. Experimentally, we show that the algorithms converge to solutions that dominate baselines based on fairness regularization and adversarial training.
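The Pareto-front vocabulary of the abstract — efficient trade-offs where no objective improves without harming another — reduces to a dominance check over objective vectors. A minimal sketch, using hypothetical fairness-violation values (objectives to be minimized):

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the efficient trade-offs: points dominated by no other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# hypothetical (criterion-1 violation, criterion-2 violation) pairs for
# candidate detectors found by a stochastic search
candidates = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1), (0.6, 0.6), (0.2, 0.8)]
front = pareto_front(candidates)
# (0.6, 0.6) is dominated by (0.5, 0.5) and drops out; the rest are efficient
```

The paper's stochastic search approximates such a front over competing fairness criteria, where enumerating all trade-offs exactly would be exponential.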