Network connectivity optimization, which aims to manipulate network connectivity by changing its underlying topology, is a fundamental task behind a wealth of high-impact data mining applications, ranging from immunization, critical infrastructure construction, social collaboration mining, bioinformatics analysis, to intelligent transportation system design. To tackle its exponential computation complexity, greedy algorithms have been extensively used for network connectivity optimization by exploiting its diminishing returns property. Despite the empirical success, two key challenges largely remain open. First, on the theoretic side, the hardness, as well as the approximability of the general network connectivity optimization problem are still nascent except for a few special instances. Second, on the algorithmic side, current algorithms are often hard to balance between the optimization quality and the computational efficiency. In this paper, we systematically address these two challenges for the network connectivity optimization problem. First, we reveal some fundamental limits by proving that, for a wide range of network connectivity optimization problems, (1) they are NP-hard and (2) (1-1/e) is the optimal approximation ratio for any polynomial algorithms. Second, we propose an effective, scalable and general algorithm (CONTAIN) to carefully balance the optimization quality and the computational efficiency.
more »
« less
Efficient Algorithms towards Network Intervention
Research suggests that social relationships have substantial impacts on individuals’ health outcomes. Network intervention, through careful planning, can assist a network of users to build healthy relationships. However, most previous work is not designed to assist such planning by carefully examining and improving multiple network characteristics. In this paper, we propose and evaluate algorithms that facilitate network intervention planning through simultaneous optimization of network degree, closeness, betweenness, and local clustering coefficient, under scenarios involving Network Intervention with Limited Degradation - for Single target (NILD-S) and Network Intervention with Limited Degradation - for Multiple targets (NILD-M). We prove that NILD-S and NILD-M are NP-hard and cannot be approximated within any ratio in polynomial time unless P=NP. We propose the Candidate Re-selection with Preserved Dependency (CRPD) algorithm for NILD-S, and the Objective-aware Intervention edge Selection and Adjustment (OISA) algorithm for NILD-M. Various pruning strategies are designed to boost the efficiency of the proposed algorithms. Extensive experiments on various real social networks collected from public schools and Web and an empirical study are conducted to show that CRPD and OISA outperform the baselines in both efficiency and effectiveness.
more »
« less
- PAR ID:
- 10170813
- Date Published:
- Journal Name:
- Proceedings of The Web Conference 2020
- Page Range / eLocation ID:
- 2021 to 2031
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
In many real-world applications such as social network analysis and online marketing/advertising, community detection is a fundamental task to identify communities (subgraphs) in social networks with high structural cohesiveness. While previous works focus on detecting communities alone, they do not consider the collective influences of users in these communities on other user nodes in social networks. Inspired by this, in this paper, we investigate the influence propagation from some seed communities and their influential effects that result in the influenced communities. We propose a novel problem, named Top-L most Influential Community DEtection (TopL-ICDE) over social networks, which aims to retrieve top-L seed communities with the highest influences, having high structural cohesiveness, and containing user-specified query keywords. To efficiently tackle the TopL-ICDE problem, we design effective pruning strategies to filter out false alarms of seed communities and propose an effective index mechanism to facilitate efficient Top-L community retrieval. We develop an efficient TopL-ICDE answering algorithm by traversing the index and applying our proposed pruning strategies. We also formulate and tackle a variant of TopL-ICDE, named diversified top-L most influential community detection (DTopL-ICDE), which returns a set of L diversified communities with the highest diversity score (i.e., collaborative influences by L communities). We prove that DTopL-ICDE is NP-hard, and propose an efficient greedy algorithm with our designed diversity score pruning. Through extensive experiments, we verify the efficiency and effectiveness of our proposed TopL-ICDE and DTopL-ICDE approaches over real/synthetic social networks under various parameter settings.more » « less
-
Content spread inequity is a potential unfairness issue in online social networks, disparately impacting minority groups. In this paper, we view friendship suggestion, a common feature in social network platforms, as an opportunity to achieve an equitable spread of content. In particular, we propose to suggest a subset of potential edges (currently not existing in the network but likely to be accepted) that maximizes content spread while achieving fairness. Instead of re-engineering the existing systems, our proposal builds a fairness wrapper on top of the existing friendship suggestion components. We prove the problem is NP-hard and inapproximable in polynomial time unless P=NP. Therefore, allowing relaxation of the fairness constraint, we propose an algorithm based on LP-relaxation and randomized rounding with fixed approximation ratios on fairness and content spread. We provide multiple optimizations, further improving the performance of our algorithm in practice. Besides, we propose a scalable algorithm that dynamically adds subsets of nodes, chosen via iterative sampling, and solves smaller problems corresponding to these nodes. Besides theoretical analysis, we conduct comprehensive experiments on real and synthetic data sets. Across different settings, our algorithms found solutions with near-zero unfairness while significantly increasing the content spread. Our scalable algorithm could process a graph with half a million nodes on a single machine, reducing the unfairness to around 0.0004 while lifting content spread by 43%.more » « less
-
We study the Bayesian multi-task variable selection problem, where the goal is to select activated variables for multiple related data sets simultaneously. We propose a new variational Bayes algorithm which generalizes and improves the recently developed “sum of single effects” model of Wang et al. (2020a). Motivated by differential gene network analysis in biology, we further extend our method to joint structure learning of multiple directed acyclic graphical models, a problem known to be computationally highly challenging. We propose a novel order MCMC sampler where our multi-task variable selection algorithm is used to quickly evaluate the posterior probability of each ordering. Both simulation studies and real gene expression data analysis are conducted to show the efficiency of our method. Finally, we also prove a posterior consistency result for multi-task variable selection, which provides a theoretical guarantee for the proposed algorithms. Supplementary materials for this article are available online.more » « less
-
Elastic optical networks (EONs) are able to provide high spectrum utilization efficiency due to flexibility in resource assignment. In translucent EONs, by employing regenerators and using advanced modulation formats for transmission, spectrum efficiency can be further improved. Survivability is regarded as an important aspect of EONs, and p-cycle protection is considered to be an attractive scheme due to its fast restoration and high protection efficiency. In this paper, we propose methods for evaluating and selecting p-cycles for both link protection (LP) and failure-independent path protection (FIPP) to survive single-link failures. After considering the various factors that affect the performance of a p-cycle, we propose two evaluation metrics for LP and FIPP, namely, individual p-cycle cost and set of cycles cost. Based on these metrics, we propose two algorithms for selecting a set of p-cycles in translucent EONs: Traffic Independent P-cycle Selection (TIPS), which selects a set of cycles without knowledge of the traffic, and Traffic-Oriented P-cycle Selection (TOPS), which takes given traffic information into account. A routing and spectrum assignment algorithm is designed for translucent EONs, and our p-cycle design algorithms are evaluated using both static and dynamic traffic models. Simulation results show that the proposed algorithms have better performance than commonly used baseline algorithms. We also compare the performance of LP p-cycles and FIPP p-cycles.more » « less
An official website of the United States government

