The analysis of social and biological networks often involves model- ing clusters of interest as cliques or their graph-theoretic generaliza- tions. The 𝑘-club model, which relaxes the requirement of pairwise adjacency in a clique to length-bounded paths inside the cluster, has been used to model cohesive subgroups in social networks and functional modules/complexes in biological networks. However, if the graphs are time-varying, or if they change under different conditions, we may be interested in clusters that preserve their property over time or under changes in conditions. To model such clusters that are conserved in a collection of graphs, we consider a cross-graph 𝑘-club model, a subset of nodes that forms a 𝑘-club in every graph in the collection. In this paper, we consider the canonical optimization problem of finding a cross-graph 𝑘-club of maximum cardinality. We introduce algorithmic ideas to solve this problem and evaluate their performance on some benchmark instances. Published in: Proceedings of The International Network Optimization Conference (INOC) 2022, Aachen, Germany
more »
« less
Finding conserved low‐diameter subgraphs in social and biological networks
Abstract The analysis of social and biological networks often involves modeling clusters of interest ascliquesor their graph‐theoretic generalizations. The ‐club model, which relaxes the requirement of pairwise adjacency in a clique to length‐bounded paths inside the cluster, has been used to model cohesive subgroups in social networks and functional modules or complexes in biological networks. However, if the graphs are time‐varying, or if they change under different conditions, we may be interested in clusters that preserve their property over time or under changes in conditions. To model such clusters that are conserved in a collection of graphs, we consider across‐graph‐clubmodel, a subset of nodes that forms a ‐club in every graph in the collection. In this article, we consider the canonical optimization problem of finding a cross‐graph ‐club of maximum cardinality in a graph collection. We develop integer programming approaches to solve this problem. Specifically, we introduce strengthened formulations, valid inequalities, and branch‐and‐cut algorithms based on delayed constraint generation. The results of our computational study indicate the significant benefits of using the approaches we introduce.
more »
« less
- Award ID(s):
- 2145553
- PAR ID:
- 10560982
- Publisher / Repository:
- Wiley
- Date Published:
- Journal Name:
- Networks
- Volume:
- 84
- Issue:
- 4
- ISSN:
- 0028-3045
- Page Range / eLocation ID:
- 509 to 527
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Cliques and their generalizations are frequently used to model “tightly knit” clusters in graphs and identifying such clusters is a popular technique used in graph-based data mining. One such model is the s-club, which is a vertex subset that induces a subgraph of diameter at most s. This model has found use in a variety of fields because low-diameter clusters have practical significance in many applications. As this property is not hereditary on vertex-induced subgraphs, the diameter of a subgraph could increase upon the removal of some vertices and the subgraph could even become disconnected. For example, star graphs have diameter two but can be disconnected by removing the central vertex. The pursuit of a fault-tolerant extension of the s-club model has spawned two variants that we study in this article: robust s-clubs and hereditary s-clubs. We analyze the complexity of the verification and optimization problems associated with these variants. Then, we propose cut-like integer programming formulations for both variants whenever possible and investigate the separation complexity of the cut-like constraints. We demonstrate through our extensive computational experiments that the algorithmic ideas we introduce enable us to solve the problems to optimality on benchmark instances with several thousand vertices. This work lays the foundations for effective mathematical programming approaches for finding fault-tolerant s-clubs in large-scale networks. History: Accepted by David Alderson, Area Editor for Network Optimization: Algorithms & Applications. Funding: The computing for this project was performed at the High Performance Computing Center at Oklahoma State University supported in part through the National Science Foundation [Grant OAC-1531128]. This material is based upon work supported by the National Science Foundation under [Grants 1662757 and 1942065]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/ijoc.2022.1231 .more » « less
-
Graph clustering is a fundamental problem in social network analysis, the goal of which is to group vertices of a graph into a series of densely knitted clusters with each cluster well separated from all the others. Classical graph clustering methods take advantage of the graph topology to model and quantify vertex proximity. With the proliferation of rich graph contents, such as user profiles in social networks, and gene annotations in protein interaction networks, it is essential to consider both the structure and content information of graphs for high-quality graph clustering. In this paper, we propose a graph embedding approach to clustering content-enriched graphs. The key idea is to embed each vertex of a graph into a continuous vector space where the localized structural and attributive information of vertices can be encoded in a unified, latent representation. Specifically, we quantify vertex-wise attribute proximity into edge weights, and employ truncated, attribute-aware random walks to learn the latent representations for vertices. We evaluate our attribute-aware graph embedding method in real-world attributed graphs, and the results demonstrate its effectiveness in comparison with state-of-the-art algorithms.more » « less
-
Graphs are ubiquitous structures found in numerous real-world applications, such as drug discovery, recommender systems, and social network analysis. To model graph-structured data, graph neural networks (GNNs) have become a popular tool. However, existing GNN architectures encounter challenges in cross-graph learning where multiple graphs have different feature spaces. To address this, recent approaches introduce text-attributed graphs (TAGs), where each node is associated with a textual description, which can be projected into a unified feature space using textual encoders. While promising, this method relies heavily on the availability of text-attributed graph data, which is difficult to obtain in practice. To bridge this gap, we propose a novel method named Topology-Aware Node description Synthesis (TANS), leveraging large language models (LLMs) to convert existing graphs into text-attributed graphs. The key idea is to integrate topological information into LLMs to explain how graph topology influences node semantics. We evaluate our TANS on text-rich, text-limited, and text-free graphs, demonstrating its applicability. Notably, on text-free graphs, our method significantly outperforms existing approaches that manually design node features, showcasing the potential of LLMs for preprocessing graph-structured data in the absence of textual information. The code and data are available at https://github.com/Zehong-Wang/TANS.more » « less
-
We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $$t\diverge$$) it pulls only the arm with the highest expected reward. While this goal is provably impossible for an isolated individual due to bounded memory, we show that, in social groups, this goal can be achieved easily with the aid of social persuasion (i.e., communication) as long as the communication networks/graphs satisfy some mild conditions. In this work, we model and analyze a type of learning dynamics which are well-observed in social groups. Specifically, under the learning dynamics of interest, an individual sequentially decides on which arm to pull next based on not only its private reward feedback but also the suggestion provided by a randomly chosen neighbor. To deal with the interplay between the randomness in the rewards and in the social interaction, we employ the \em mean-field approximation method. Considering the possibility that the individuals in the networks may not be exchangeable when the communication networks are not cliques, we go beyond the classic mean-field techniques and apply a refined version of mean-field approximation: Using coupling we show that, if the communication graph is connected and is either regular or has doubly-stochastic degree-weighted adjacency matrix, with probability → 1 as the social group size N → ∞, every individual in the social group learns the best option. If the minimum degree of the graph diverges as N → ∞, over an arbitrary but given finite time horizon, the sample paths describing the opinion evolutions of the individuals are asymptotically independent. In addition, the proportions of the population with different opinions converge to the unique solution of a system of ODEs. Interestingly, the obtained system of ODEs are invariant to the structures of the communication graphs. In the solution of the obtained ODEs, the proportion of the population holding the correct opinion converges to 1 exponentially fast in time. Notably, our results hold even if the communication graphs are highly sparse.more » « less
An official website of the United States government

