skip to main content

Title: Fair Graph Mining
In today’s increasingly connected world, graph mining plays a piv- otal role in many real-world application domains, including social network analysis, recommendations, marketing and financial secu- rity. Tremendous efforts have been made to develop a wide range of computational models. However, recent studies have revealed that many widely-applied graph mining models could suffer from potential discrimination. Fairness on graph mining aims to develop strategies in order to mitigate bias introduced/amplified during the mining process. The unique challenges of enforcing fairness on graph mining include (1) theoretical challenge on non-IID nature of graph data, which may invalidate the basic assumption behind many existing studies in fair machine learning, and (2) algorith- mic challenge on the dilemma of balancing model accuracy and fairness. This tutorial aims to (1) present a comprehensive review of state-of-the-art techniques in fairness on graph mining and (2) identify the open challenges and future trends. In particular, we start with reviewing the background, problem definitions, unique challenges and related problems; then we will focus on an in-depth overview of (1) recent techniques in enforcing group fairness, indi- vidual fairness and other fairness notions in the context of graph mining, and (2) future directions in studying algorithmic fairness on more » graphs. We believe this tutorial could be attractive to researchers and practitioners in areas including data mining, artificial intel- ligence, social science and beneficial to a plethora of real-world application domains. « less
Authors:
;
Award ID(s):
1939725 1947135
Publication Date:
NSF-PAR ID:
10332511
Journal Name:
CIKM
Page Range or eLocation-ID:
4849 to 4852
Sponsoring Org:
National Science Foundation
More Like this
  1. Graph is a ubiquitous type of data that appears in many real-world applications, including social network analysis, recommendations and financial security. Important as it is, decades of research have developed plentiful computational models to mine graphs. Despite its prosperity, concerns with respect to the potential algorithmic discrimination have been grown recently. Algorithmic fairness on graphs, which aims to mitigate bias introduced or amplified during the graph mining process, is an attractive yet challenging research topic. The first challenge corresponds to the theoretical challenge, where the non-IID nature of graph data may not only invalidate the basic assumption behind many existing studies in fair machine learning, but also introduce new fairness definition(s) based on the inter-correlation between nodes rather than the existing fairness definition(s) in fair machine learning. The second challenge regarding its algorithmic aspect aims to understand how to balance the trade-off between model accuracy and fairness. This tutorial aims to (1) comprehensively review the state-of-the-art techniques to enforce algorithmic fairness on graphs and (2) enlighten the open challenges and future directions. We believe this tutorial could benefit researchers and practitioners from the areas of data mining, artificial intelligence and social science.
  2. Networks (i.e., graphs) are often collected from multiple sources and platforms, such as social networks extracted from multiple online platforms, team-specific collaboration networks within an organization, and inter-dependent infrastructure networks, etc. Such networks from different sources form the multi-networks, which can exhibit the unique patterns that are invisible if we mine the individual network separately. However, compared with single-network mining, multi-network mining is still under-explored due to its unique challenges. First ( multi-network models ), networks under different circumstances can be modeled into a variety of models. How to properly build multi-network models from the complex data? Second ( multi-network mining algorithms ), it is often nontrivial to either extend single-network mining algorithms to multi-networks or design new algorithms. How to develop effective and efficient mining algorithms on multi-networks? The objectives of this tutorial are to: (1) comprehensively review the existing multi-network models, (2) elaborate the techniques in multi-network mining with a special focus on recent advances, and (3) elucidate open challenges and future research directions. We believe this tutorial could be beneficial to various application domains, and attract researchers and practitioners from data mining as well as other interdisciplinary fields.
  3. Graph-structured data naturally appear in numerous application domains, ranging from social analysis, bioinformatics to computer vision. The unique capability of graphs enables capturing the structural relations among data, and thus allows to harvest more insights compared to analyzing data in isolation. However, graph mining is a challenging task due to the underlying complex and diverse connectivity patterns. A potential solution is to learn the representation of a graph in a low-dimensional Euclidean space via embedding techniques that preserve the graph properties. Although tremendous efforts have been made to address the graph representation learning problem, many of them still suffer from their shallow learning mechanisms. On the other hand, deep learning models on graphs have recently emerged in both machine learning and data mining areas and demonstrated superior performance for various problems. In this survey, we conduct a comprehensive review specifically on the emerging field of graph convolutional networks, which is one of the most prominent graph deep learning models. We first introduce two taxonomies to group the existing works based on the types of convolutions and the areas of applications, then highlight some graph convolutional network models in details. Finally, we present several challenges in this area and discuss potentialmore »directions for future research.« less
  4. Relevance to proposal: This project evaluates the generalizability of real and synthetic training datasets which can be used to train model-free techniques for multi-agent applications. We evaluate different methods of generating training corpora and machine learning techniques including Behavior Cloning and Generative Adversarial Imitation Learning. Our results indicate that the utility-guided selection of representative scenarios to generate synthetic data can have significant improvements on model performance. Paper abstract: Crowd simulation, the study of the movement of multiple agents in complex environments, presents a unique application domain for machine learning. One challenge in crowd simulation is to imitate the movement of expert agents in highly dense crowds. An imitation model could substitute an expert agent if the model behaves as good as the expert. This will bring many exciting applications. However, we believe no prior studies have considered the critical question of how training data and training methods affect imitators when these models are applied to novel scenarios. In this work, a general imitation model is represented by applying either the Behavior Cloning (BC) training method or a more sophisticated Generative Adversarial Imitation Learning (GAIL) method, on three typical types of data domains: standard benchmarks for evaluating crowd models, random samplingmore »of state-action pairs, and egocentric scenarios that capture local interactions. Simulated results suggest that (i) simpler training methods are overall better than more complex training methods, (ii) training samples with diverse agent-agent and agent-obstacle interactions are beneficial for reducing collisions when the trained models are applied to new scenarios. We additionally evaluated our models in their ability to imitate real world crowd trajectories observed from surveillance videos. Our findings indicate that models trained on representative scenarios generalize to new, unseen situations observed in real human crowds.« less
  5. Social recommendation task aims to predict users' preferences over items with the incorporation of social connections among users, so as to alleviate the sparse issue of collaborative filtering. While many recent efforts show the effectiveness of neural network-based social recommender systems, several important challenges have not been well addressed yet: (i) The majority of models only consider users’ social connections, while ignoring the inter-dependent knowledge across items; (ii) Most of existing solutions are designed for singular type of user-item interactions, making them infeasible to capture the interaction heterogeneity; (iii) The dynamic nature of user-item interactions has been less explored in many social-aware recommendation techniques. To tackle the above challenges, this work proposes a Knowledge-aware Coupled Graph Neural Network (KCGN) that jointly injects the inter-dependent knowledge across items and users into the recommendation framework. KCGN enables the high-order user- and item-wise relation encoding by exploiting the mutual information for global graph structure awareness. Additionally, we further augment KCGN with the capability of capturing dynamic multi-typed user-item interactive patterns. Experimental studies on real-world datasets show the effectiveness of our method against many strong baselines in a variety of settings. Source codes are available at: https://github.com/xhcdream/KCGN.