skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Mutual Information in Community Detection with Covariate Information and Correlated Networks
We study the problem of community detection when there is covariate information about the node labels and one observes multiple correlated networks. We provide an asymptotic upper bound on the per-node mutual information as well as a heuristic analysis of a multivariate performance measure called the MMSE matrix. These results show that the combined effects of seemingly very different types of information can be characterized explicitly in terms of formulas involving low-dimensional estimation problems in additive Gaussian noise. Our analysis is supported by numerical simulations.  more » « less
Award ID(s):
1750362
PAR ID:
10139966
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Page Range / eLocation ID:
602 to 607
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Representation learning of graph-structured data is challenging because both graph structure and node features carry important information. Graph Neural Networks (GNNs) provide an expressive way to fuse information from network structure and node features. However, GNNs are prone to adversarial attacks. Here we introduce Graph Information Bottleneck (GIB), an information-theoretic principle that optimally balances expressiveness and robustness of the learned representation of graph-structured data. Inheriting from the general Information Bottleneck (IB), GIB aims to learn the minimal sufficient representation for a given task by maximizing the mutual information between the representation and the target, and simultaneously constraining the mutual information between the representation and the input data. Different from the general IB, GIB regularizes the structural as well as the feature information. We design two sampling algorithms for structural regularization and instantiate the GIB principle with two new models: GIB-Cat and GIB-Bern, and demonstrate the benefits by evaluating the resilience to adversarial attacks. We show that our proposed models are more robust than state-of-the art graph defense models. GIB-based models empirically achieve up to 31% improvement with adversarial perturbation of the graph structure as well as node features. 
    more » « less
  2. Abstract We propose and unify classes of different models for information propagation over graphs. In a first class, propagation is modelled as a wave, which emanates from a set ofknownnodes at an initial time, to all otherunknownnodes at later times with an ordering determined by the arrival time of the information wave front. A second class of models is based on the notion of a travel time along paths between nodes. The time of information propagation from an initialknownset of nodes to a node is defined as the minimum of a generalised travel time over subsets of all admissible paths. A final class is given by imposing a local equation of an eikonal form at eachunknownnode, with boundary conditions at theknownnodes. The solution value of the local equation at a node is coupled to those of neighbouring nodes with lower values. We provide precise formulations of the model classes and prove equivalences between them. Finally, we apply the front propagation models on graphs to semi-supervised learning via label propagation and information propagation on trust networks. 
    more » « less
  3. This article introduces a novel mutual informationbased measure to assess the glass ceiling effect in preferential attachment networks, which advances the analysis of inequalities in attributed networks. Using Shannon entropy and generalizing to Rényi entropy, our measure evaluates the conditional probability distributions of node attributes given the node degrees of adjacent nodes, which offers a more nuanced understanding of inequality compared to traditional methods that emphasize node degree distributions and degree assortativity alone. To evaluate the efficacy of the proposed measure, we evaluate it using an analytical structural inequality model as well as historical publication data. Results show that our mutual information measure aligns well with both the theoretical model and empirical data, underscoring its reliability as a robust approach for capturing inequalities in attributed networks. Moreover, we introduce a novel stochastic optimization algorithm that utilizes a parameterized conditional logit model for edge addition. Our algorithm is shown to outperform the baseline uniform distribution based approach in mitigating the glass ceiling effect. By strategically recommending links based on this algorithm, we can effectively hinder the glass ceiling effect within networks. 
    more » « less
  4. Agent-based crawlers are commonly used in network maintenance and information gathering. In order not to disturb the main functionality of the system, whether acting at nodes or being in transit, they need to operate online, perform a single operation fast and use small memory. They should also be preferably deterministic, as crawling agents have limited capabilities of generating a large number of truly random bits. We consider a system in which an agent receives an update, typically an insertion or deletion, of some information upon visiting a node. On request, the agent needs to output hot information, i.e., with the net occurrence above certain frequency threshold. A desired time and memory complexity of such agent should be poly-logarithmic in the number of visited nodes and inversely proportional to the frequency threshold. Ours is the first such agent with rigorous analysis and a complementary almost-matching lower bound. 
    more » « less
  5. null (Ed.)
    This work proposes a new unsupervised (or self-supervised) node representation learning method that aims to leverage the coarse-grain information that is available in most graphs. This extends previous attempts that only leverage fine-grain information (similarities within local neighborhoods) or global graph information (similarities across all nodes). Intuitively, the proposed method identifies nodes that belong to the same clusters and maximizes their mutual information. Thus, coarse-grain (cluster-level) similarities that are shared between nodes are preserved in their representations. The core components of the proposed method are (i) a jointly optimized clustering of nodes during learning and (ii) an Infomax objective term that preserves the mutual information among nodes of the same clusters. Our method is able to outperform competing state-of-art methods in various downstream tasks, such as node classification, link prediction, and node clustering. Experiments show that the average gain is between 0.2% and 6.1%, over the best competing approach, over all tasks. Our code is publicly available at: https://github.com/cmavro/Graph-InfoClust-GIC. 
    more » « less