Search for: All records

Creators/Authors contains: "Liu, Jiaxin"

  1. User-generated product reviews are essential for online platforms like Amazon and Yelp. However, the presence of fake reviews misleads customers. Graph neural networks (GNNs) are the state-of-the-art method for detecting suspicious reviewers by exploiting the topology of the graph connecting reviewers, reviews, and products. Nevertheless, discrepancies in detection accuracy across different groups of reviewers degrade reviewer engagement and customer trust in review websites. Unlike the previous belief that differences between groups cause unfairness, we study subgroup structures within the groups that can also cause discrepancies in how different groups are treated. This paper addresses the challenges of defining, approximating, and utilizing a new subgroup structure for fair spam detection. We first identify subgroup structures in the review graph that lead to discrepant accuracy across the groups. The complex dependencies over the review graph make it difficult to tease out subgroups hidden within larger groups. We design a model that can be trained to jointly infer the hidden subgroup memberships and exploit the memberships to calibrate the detection accuracy across groups. Comprehensive comparisons against baselines on three large Yelp review datasets demonstrate that the subgroup membership can be identified and exploited for group fairness.
    Free, publicly-accessible full text available August 6, 2024
  2. Social networks are frequently polluted by rumors, which can be detected by advanced models such as graph neural networks. However, the models are vulnerable to attacks, and discovering and understanding the vulnerabilities is critical to robust rumor detection. To discover subtle vulnerabilities, we design an attacking algorithm based on reinforcement learning to camouflage rumors against black-box detectors. We address exponentially large state spaces, high-order graph dependencies, and ranking dependencies, which are unique to the problem setting but fundamentally challenging for the state-of-the-art end-to-end approaches. We design domain-specific features that have a causal effect on the reward, so that even a linear policy can arrive at powerful attacks with additional interpretability. To speed up policy optimization, we devise: (i) a credit assignment method that proportionally decomposes delayed and aggregated rewards into atomic attacking actions to enhance feature-reward associations; (ii) a time-dependent control variate to reduce prediction variance due to large state-action spaces and long attack horizons, based on reward variance analysis and a Bayesian analysis of the prediction distribution. On two real-world datasets of rumor detection tasks, we demonstrate: (i) the effectiveness of the learned attacking policy on a wide spectrum of target models compared to both rule-based and end-to-end attacking approaches; (ii) the usefulness of the proposed credit assignment strategy and variance reduction components; (iii) the interpretability of the attacking policy.
  3. Existing methods for fitting spatial autoregressive models have various strengths and weaknesses. For example, the maximum likelihood estimation (MLE) approach yields efficient estimates but is computationally burdensome. Computationally efficient methods, such as the generalized method of moments (GMM) and spatial two‐stage least squares (2SLS), typically require exogenous covariates to be significant, a restrictive assumption that may fail in practice. We propose a new estimating equation approach, termed combined moment equation (COME), which combines the first moment with covariance conditions on the residual terms. The proposed estimator is less computationally demanding than MLE and does not need the restrictive exogenous conditions required by GMM and 2SLS. We show that the proposed estimator is consistent and establish its asymptotic distribution. Extensive simulations demonstrate that the proposed method outperforms the competitors in terms of bias, efficiency, and computation. We apply the proposed method to an air pollution study and obtain results about the spatial distribution of PM2.5 concentrations in Beijing.
  4. Spam reviews are prevalent in review systems, where they are used to manipulate seller reputation and mislead customers. Spam detectors based on graph neural networks (GNNs) exploit representation learning and graph patterns to achieve state-of-the-art detection accuracy. The detection can influence a large number of real-world entities, and it is ethical to treat different groups of entities as equally as possible. However, due to skewed distributions of the graphs, GNNs can fail to meet the diverse fairness criteria designed for different parties. We formulate linear systems of the input features and the adjacency matrix of the review graphs for the certification of multiple fairness criteria. When the criteria compete, we relax the certification and design a multi-objective optimization (MOO) algorithm to explore multiple efficient trade-offs, in which no objective can be improved without harming another objective. We prove that the algorithm converges to a Pareto efficient solution using duality and the implicit function theorem. Since there can be exponentially many trade-offs among the criteria, we propose a data-driven stochastic search algorithm to approximate Pareto fronts consisting of multiple efficient trade-offs. Experimentally, we show that the algorithms converge to solutions that dominate baselines based on fairness regularization and adversarial training.
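
The fairness abstract above describes efficient trade-offs as solutions where "no objective can be improved without harming another objective". As a minimal sketch of that definition only (not the authors' MOO or stochastic search algorithm), the Python below filters a set of candidate solutions down to the non-dominated ones. The function names and the candidate scores are hypothetical, and each tuple is assumed to hold two fairness-violation scores where lower is better.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` on every objective
    and strictly better on at least one (lower is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (criterion A violation, criterion B violation) pairs
# for five candidate detectors; these numbers are made up for illustration.
candidates = [
    (0.10, 0.30),
    (0.20, 0.10),
    (0.15, 0.35),  # dominated by (0.10, 0.30)
    (0.30, 0.05),
    (0.25, 0.20),  # dominated by (0.20, 0.10)
]

front = pareto_front(candidates)
print(front)
```

Each surviving point is a distinct trade-off: moving from one to another lowers one score only by raising the other. This brute-force filter compares every pair, so it is only practical for small candidate sets.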