Robust Spammer Detection by Nash Reinforcement Learning

Dou, Yingtong; Ma, Guixiang; Yu, Philip S; Xie, Sihong

doi:10.1145/3394486.3403135

Online reviews provide product evaluations for customers to make decisions. Unfortunately, the evaluations can be manipulated using fake reviews (“spams”) by professional spammers, who have learned increasingly insidious and powerful spamming strategies by adapting to the deployed detectors. Spamming strategies are hard to capture, as they can vary quickly over time, differ across spammers and target products, and, more critically, remain unknown in most cases. Furthermore, most existing detectors focus on detection accuracy, which is not well aligned with the goal of maintaining the trustworthiness of product evaluations. To address these challenges, we formulate a minimax game where the spammers and spam detectors compete with each other on their practical goals that are not solely based on detection accuracy. Nash equilibria of the game lead to stable detectors that are agnostic to any mixed detection strategies. However, the game has no closed-form solution and is not differentiable, so it does not admit the typical gradient-based algorithms. We turn the game into two dependent Markov Decision Processes (MDPs) to allow efficient stochastic optimization based on multi-armed bandit and policy gradient. We experiment on three large review datasets using various state-of-the-art spamming and detection strategies and show that the optimization algorithm can reliably find an equilibrial detector that can robustly and effectively prevent spammers with any mixed spamming strategies from attaining their practical goal. Our code is available at https://github.com/YingtongDou/Nash-Detect.
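
As a rough illustration of what a mixed-strategy equilibrium of a minimax game looks like, the sketch below runs Hedge (multiplicative weights) self-play on a tiny zero-sum matrix game. It is not the Nash-Detect algorithm from the paper, which relies on MDPs with multi-armed bandit and policy gradient updates; Hedge is swapped in here only because it is short and deterministic. The payoff matrix, the function name `approximate_equilibrium`, and the parameters `steps` and `eta` are all hypothetical and chosen for illustration.

```python
import numpy as np

def approximate_equilibrium(payoff, steps=5000, eta=0.05):
    """Hedge self-play on a zero-sum matrix game (toy stand-in, see lead-in).

    payoff[i, j] is the (hypothetical) practical effect the spammer obtains
    when using spamming strategy i against detector configuration j.
    The spammer maximizes this value and the detector minimizes it.
    """
    n_attack, n_defend = payoff.shape
    p = np.full(n_attack, 1.0 / n_attack)  # spammer's mixed strategy
    q = np.full(n_defend, 1.0 / n_defend)  # detector's mixed strategy
    p_sum = np.zeros(n_attack)
    q_sum = np.zeros(n_defend)
    for _ in range(steps):
        p_sum += p
        q_sum += q
        attack_gain = payoff @ q   # expected payoff of each attack against q
        defend_loss = p @ payoff   # expected loss of each detector against p
        p = p * np.exp(eta * attack_gain)    # spammer shifts toward high gain
        q = q * np.exp(-eta * defend_loss)   # detector shifts toward low loss
        p /= p.sum()
        q /= q.sum()
    # Time-averaged strategies approximate an equilibrium of the matrix game.
    return p_sum / steps, q_sum / steps

# Hypothetical 2x2 payoffs in [0, 1]; not taken from the paper.
payoff = np.array([[0.9, 0.2],
                   [0.3, 0.7]])
p_avg, q_avg = approximate_equilibrium(payoff)

# Exploitability: how much either side could gain by deviating unilaterally.
best_attack_value = (payoff @ q_avg).max()
best_defense_value = (p_avg @ payoff).min()
gap = best_attack_value - best_defense_value
print("spammer mix:", p_avg, "detector mix:", q_avg, "gap:", gap)
```

A small `gap` means neither side can improve much by switching strategies, which is the sense in which an equilibrium detector is robust to any mixture of the spamming strategies in the toy game.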
