This content will become publicly available on December 31, 2025

Title: Estimating Exposure to Information on Social Networks
Estimating exposure to information on a social network is a problem with important consequences for our society. The exposure estimation problem involves finding the fraction of people on the network who have been exposed to a piece of information (e.g., a URL of a news article on Facebook, a hashtag on Twitter). The exact value of exposure to a piece of information is determined by two features: the structure of the underlying social network and the set of people who shared the piece of information. Often, neither feature is publicly available (i.e., access to both is limited to the internal administrators of the platform), and both are difficult to estimate from data. As a solution, we propose two methods to estimate the exposure to a piece of information in an unbiased manner: a vanilla method that samples the network uniformly and a method, motivated by the Friendship Paradox, that samples the network non-uniformly. We provide theoretical results that characterize the conditions (in terms of properties of the network and the piece of information) under which one method outperforms the other. Further, we outline extensions of the proposed methods to dynamic information cascades (where the exposure needs to be tracked in real time). We demonstrate the practical feasibility of the proposed methods via experiments on multiple synthetic and real-world datasets.
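The contrast between the two sampling schemes in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator; it rests on several assumptions: a person counts as exposed if at least one neighbor shared the item, the friendship-paradox-style sample draws nodes with probability proportional to degree (a uniformly random end of a uniformly random edge), that sample is corrected by inverse-degree reweighting using a known mean degree, and the toy graph and every function name are hypothetical.

```python
import random

def is_exposed(adj, sharers, v):
    # Assumption: v is exposed if at least one of its neighbors shared the item.
    return any(u in sharers for u in adj[v])

def exact_exposure(adj, sharers):
    # Ground truth, available only with full knowledge of the graph and the sharers.
    return sum(is_exposed(adj, sharers, v) for v in adj) / len(adj)

def uniform_estimate(adj, sharers, n_samples, rng):
    # Vanilla scheme: sample nodes uniformly and average the exposure indicator.
    nodes = list(adj)
    hits = sum(is_exposed(adj, sharers, rng.choice(nodes)) for _ in range(n_samples))
    return hits / n_samples

def degree_weighted_estimate(adj, sharers, n_samples, rng):
    # Degree-proportional scheme: each node appears once per incident edge,
    # so a uniform draw from this list picks a node with probability d(v) / (2m).
    endpoints = [v for v in adj for _ in adj[v]]
    mean_degree = len(endpoints) / len(adj)  # assumed known to the analyst
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice(endpoints)
        # Dividing by the degree undoes the sampling bias toward popular nodes.
        total += is_exposed(adj, sharers, v) / len(adj[v])
    return mean_degree * total / n_samples

# Hypothetical toy network: a hub (node 0) with five friends, plus a path 5-6-7-8-9.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
adj = {v: set() for v in range(10)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
sharers = {0}  # only the hub shared the item

rng = random.Random(0)
print(exact_exposure(adj, sharers))
print(uniform_estimate(adj, sharers, 20000, rng))
print(degree_weighted_estimate(adj, sharers, 20000, rng))
```

Both estimators target the same exposure fraction in expectation. Which has the lower variance depends on the degree distribution and on who shared the item, which is the comparison the abstract says the theoretical results characterize.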
Award ID(s):
2112457
PAR ID:
10607959
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
Journal Name:
ACM Transactions on Social Computing
Volume:
7
Issue:
1-4
ISSN:
2469-7818
Page Range / eLocation ID:
1 to 24
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. One of the challenges in social media research is that researchers and third parties often cannot obtain the massive amounts of data collected by a limited number of “big brothers” (e.g., Facebook and Google). In this paper, we shed light on leveraging social network topological properties and local information to effectively conduct search in Online Social Networks (OSNs). The problem we focus on is to discover the reachability of a group of target people in an OSN, particularly from the perspective of a third-party analyst who does not have full access to the OSN. We developed effective and efficient detection techniques that require only a small number of queries to discover people's connections (e.g., friendship) in the OSN. After conducting experiments on real-world data sets, we found that our proposed techniques perform as well as the centralized detection algorithm, which assumes the availability of global information in the OSN.
  2. This paper deals with randomized polling of a social network. In the case of forecasting the outcome of an election between two candidates A and B, classical intent polling asks randomly sampled individuals: who will you vote for? Expectation polling asks: who do you think will win? In this paper, we propose a novel neighborhood expectation polling (NEP) strategy that asks randomly sampled individuals: what is your estimate of the fraction of votes for A? In NEP, sampled individuals will therefore naturally look at their neighbors (defined by the underlying social network graph) when answering this question. Hence, the mean squared error (MSE) of NEP methods depends on selecting the optimal set of samples from the network. To this end, we propose three NEP algorithms for the following cases: (i) the social network graph is not known, but random walks (sequential exploration) can be performed on the graph; (ii) the social network graph is unknown. For cases (i) and (ii), two algorithms based on a graph-theoretic consequence called the friendship paradox are proposed. Theoretical results on the dependence of the MSE of the algorithms on the properties of the network are established. Numerical results on real and synthetic data sets are provided to illustrate the performance of the algorithms.
  3. Social networks, as an indispensable part of our daily lives, provide ideal platforms for entertainment and communication. However, the appearance of spammers who spread malicious information undermines a network’s reliability. Unlike in email spammer detection, a social network account has several types of attributes and complicated behavior patterns, which require a more sophisticated detection mechanism. To address these challenges, we propose several efficient profile and behavioral features to describe a social network account and a combined neural network to detect spammers. The combined neural network can process the features separately based on their mutual correlation and handle data with missing features. In experiments, the combined neural network outperforms several classical machine learning approaches and achieves 97.5% accuracy on real data. The proposed features and the combined neural network have already been applied commercially.
  4. In recent years, reciprocal link prediction has received some attention from data mining and social network analysis researchers, who have treated this problem as a binary classification task. However, it is also important to predict the interval time for the creation of a reciprocal link. This is a challenging problem for two reasons. First, effective features are lacking: well-known link prediction features are designed for undirected networks and for the binary classification task, so they do not work well for interval time prediction. Second, the presence of censored data instances makes traditional supervised regression methods unsuitable for solving this problem. In this paper, we propose a solution for the reciprocal link interval time prediction task. We map this problem into a survival analysis framework and show through extensive experiments on real-world datasets that survival analysis methods perform better than traditional regression, neural-network-based models, and support vector regression (SVR).
  5. Recently, aligning users among different social networks has received significant attention. However, most existing studies do not consider users’ behavior information during the alignment procedure and thus still suffer from poor learning performance. In fact, we observe that social network alignment and behavior analysis can benefit from each other. Motivated by this observation, we propose to jointly study the social network alignment problem and the user behavior analysis problem. We design a novel end-to-end framework named BANANA. In this framework, to leverage behavior analysis for social network alignment at the distribution level, we design an earth mover’s distance based alignment model to fuse users’ behavior information for more comprehensive user representations. To further leverage social network alignment for behavior analysis, in turn, we design a temporal graph neural network model to fuse behavior information in different social networks based on the alignment result. The two models above can work together in an end-to-end manner. Through extensive experiments on real-world datasets, we demonstrate that our proposed approach outperforms the state-of-the-art methods in the social network alignment task and the user behavior analysis task, respectively.