Link prediction is one of the fundamental problems in social network analysis. A common set of techniques for link prediction relies on similarity metrics, which use the topology of the observed subnetwork to quantify the likelihood of unobserved links. Recently, similarity metrics for link prediction have been shown to be vulnerable to attacks whereby observations about the network are adversarially modified to hide target links. We propose a novel approach for increasing robustness of similarity-based link prediction by endowing the analyst with a restricted set of reliable queries which accurately measure the existence of queried links. The analyst aims to robustly predict a collection of possible links by optimally allocating the reliable queries. We formalize the analyst problem as a Bayesian Stackelberg game in which the analyst first chooses the reliable queries, followed by an adversary who deletes a subset of links among the remaining (unreliable) queries by the analyst. The analyst in our model is uncertain about the particular target link the adversary attempts to hide, whereas the adversary has full information about the analyst and the network. Focusing on similarity metrics using only local information, we show that the problem is NP-hard for both players, and devise two principled and efficient approaches for solving it approximately. Extensive experiments with real and synthetic networks demonstrate the effectiveness of our approach.
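As a rough illustration of the local similarity metrics this abstract refers to, the sketch below scores a candidate link by common neighbors and by the Jaccard coefficient, then shows how deleting one observed edge lowers the score. The toy graph, node names, and helper functions are hypothetical and are not taken from the paper.

```python
def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v in the observed graph."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of u or v (0.0 if neither has any)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Hypothetical observed network; the candidate (target) link is a-b.
observed = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("a", "e")]
adj = build_adj(observed)
score_before = common_neighbors(adj, "a", "b")

# An adversary hides the target by deleting one observed edge, here b-c.
attacked = [e for e in observed if e != ("b", "c")]
adj_attacked = build_adj(attacked)
score_after = common_neighbors(adj_attacked, "a", "b")

print(score_before, score_after)
print(jaccard(adj, "a", "b"), jaccard(adj_attacked, "a", "b"))
```

In this toy setting, a reliable query on the pair b-c would reveal the deleted edge, which is the intuition behind giving the analyst a limited budget of such queries.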
Adversarial Link Prediction in Spatial Networks
Social networks arise as a result of complex interactions among people, and homophily plays an important role in this process. If we view homophily as a dominant force in network formation and associate each node with a collection of features, this process gives rise to spatial networks, with likelihood of an edge an increasing function of feature similarity among its incident nodes. A link prediction problem in such spatial networks then amounts to determining whether the pair of nodes are sufficiently close according to this edge likelihood function. We undertake the first algorithmic study of the adversarial side of this problem in which the adversary manipulates features of a subset of nodes on the network to prevent predicting target edges. We show that this problem is NP-hard, even if the edge likelihood function is convex. On the other hand, if this function is convex, we show that the problem can be solved with convex programming when the set of nodes that the adversary needs to manipulate is fixed. Furthermore, if the edge likelihood function is linear, we present approximation algorithms for the case when the features are binary, and we wish to hide only a single edge, and for the case when the features are real-valued but we need to hide an arbitrary collection of edges.
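The setting described above can be sketched in a few lines, under assumptions that are ours rather than the paper's: an edge is predicted when a weighted L1 distance between the two nodes' feature vectors is at most a threshold, and the adversary hides a target edge by changing the features of one endpoint. The distance form, weights, threshold, and feature values are all hypothetical.

```python
def weighted_distance(x, y, weights):
    """Weighted L1 distance between two feature vectors (an assumed, linear form)."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


def predict_edge(x, y, weights, threshold):
    """Predict an edge when the two nodes are sufficiently close in feature space."""
    return weighted_distance(x, y, weights) <= threshold


# Hypothetical node features, weights, and threshold.
features = {"u": [0.2, 0.5], "v": [0.3, 0.4]}
weights = [1.0, 1.0]
threshold = 0.5

# Before manipulation the distance is about 0.2, so the edge u-v is predicted.
before = predict_edge(features["u"], features["v"], weights, threshold)

# The adversary shifts the first feature of v; the distance grows to about 0.8.
manipulated = dict(features, v=[0.9, 0.4])
after = predict_edge(manipulated["u"], manipulated["v"], weights, threshold)

print(before, after)
```

The sketch fixes which node is manipulated. Per the abstract, choosing the subset of nodes to manipulate is what makes the adversary's problem hard in general.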
- PAR ID:
- 10440486
- Date Published:
- Journal Name:
- AAMAS Conference proceedings
- ISSN:
- 2523-5699
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The 2020 United States (US) presidential election was, and has continued to be, the focus of pervasive and persistent mis- and disinformation spreading through our media ecosystems, including social media. This event has driven the collection and analysis of large, directed social network datasets, but such datasets can resist intuitive understanding. In such large datasets, the overwhelming number of nodes and edges present in typical representations creates visual artifacts, such as densely overlapping edges and tightly-packed formations of low-degree nodes, which obscure many features of more practical interest. We apply a method, coengagement transformations, to convert such networks of social data into tractable images. Intuitively, this approach allows for parameterized network visualizations that make shared audiences of engaged viewers salient to viewers. Using the interpretative capabilities of this method, we perform an extensive case study of the 2020 United States presidential election on Twitter, contributing an empirical analysis of coengagement. By creating and contrasting different networks at different parameter sets, we define and characterize several structures in this discourse network, including bridging accounts, satellite audiences, and followback communities. We discuss the importance and implications of these empirical network features in this context. In addition, we release open-source code for creating coengagement networks from Twitter and other structured interaction data.
-
Ranking on networks plays an important role in many high-impact applications, including recommender systems, social network analysis, bioinformatics and many more. In the age of big data, a recent trend is to address the variety aspect of network ranking. Among others, two representative lines of research include (1) heterogeneous information network with different types of nodes and edges, and (2) network of networks with edges at different resolutions. In this paper, we propose a new network model named Network of Heterogeneous Information Networks (NeoHIN for short) that is capable of simultaneously modeling both different types of nodes/edges, and different edge resolutions. We further propose two new ranking algorithms on NeoHIN based on the cross-domain consistency principle. Experiments on synthetic and real-world networks show that our proposed algorithms are (1) effective, outperforming other existing methods, and (2) efficient, with no additional time cost per iteration compared to their counterparts.
-
Communities are a common and widely studied structure in networks, typically assuming that the network is fully and correctly observed. In practice, network data are often collected by querying nodes about their connections. In some settings, all edges of a sampled node will be recorded, and in others, a node may be asked to name its connections. These sampling mechanisms introduce noise and bias, which can obscure the community structure and invalidate assumptions underlying standard community detection methods. We propose a general model for a class of network sampling mechanisms based on recording edges via querying nodes, designed to improve community detection for network data collected in this fashion. We model edge sampling probabilities as a function of both individual preferences and community parameters, and show community detection can be performed by spectral clustering under this general class of models. We also propose, as a special case of the general framework, a parametric model for directed networks we call the nomination stochastic block model, which allows for meaningful parameter interpretations and can be fitted by the method of moments. In this case, spectral clustering and the method of moments are computationally efficient and come with theoretical guarantees of consistency. We evaluate the proposed model in simulation studies on unweighted and weighted networks and under misspecified models. The method is applied to a faculty hiring dataset, discovering a meaningful hierarchy of communities among US business schools.
-
This paper addresses the problem of decentralized learning in the presence of data poisoning attacks. In this problem, we consider a collection of nodes connected through a network, each equipped with a local function. The objective is to compute the global minimizer of the aggregated local functions, in a decentralized manner, i.e., each node can only use its local function and data exchanged with nodes it is connected to. Moreover, each node is to agree on the said minimizer despite an adversary that can arbitrarily change the local functions of a fraction of the nodes. This problem setting has applications in robust learning, where nodes in a network are collectively training a model that minimizes the empirical loss with possibly attacked local data sets. In this paper, we propose a novel decentralized learning algorithm that enables all nodes to reach consensus on the optimal model, in the absence of attacks, and approximate consensus in the presence of data poisoning attacks.