
Title: A nature inspired modularity function for unsupervised learning involving spatially embedded networks

The quality of network clustering is often measured with a widely used metric known as “modularity”. Modularity compares the clusters found in a network to those present in a random graph (a “null model”). Unfortunately, modularity is somewhat ill-suited for studying spatially embedded networks, since a random graph contains no basic geometrical notions: the null model assigns a nonzero probability for an edge to appear between any pair of nodes, regardless of the distance between them. Here, we propose a variant of modularity that does not rely on a null model. To demonstrate the essentials of our method, we analyze networks generated from granular ensembles. We show that our method performs better than the most commonly used Newman-Girvan (NG) modularity in detecting the best (physically transparent) partitions in those systems. Our measure also properly detects hierarchical structures whenever these are present.
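
For readers unfamiliar with the baseline, the following is a minimal Python sketch of the standard Newman-Girvan modularity for an undirected, unweighted graph, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its nodes, and m is the number of edges. It is not the variant proposed in the paper. The function names `ng_modularity` and `null_expected_edges` and the toy graph are illustrative assumptions, not taken from the source.

```python
from collections import defaultdict


def ng_modularity(edges, community):
    """Newman-Girvan modularity for an undirected, unweighted graph.

    edges: iterable of (u, v) pairs (no duplicates, no self-loops)
    community: dict mapping each node to a community label
    """
    edges = list(edges)
    m = len(edges)
    degree = defaultdict(int)
    intra = defaultdict(int)  # number of edges inside each community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    deg_sum = defaultdict(int)  # total degree per community
    for node, k in degree.items():
        deg_sum[community[node]] += k
    # Observed intra-community edge fraction minus the null-model expectation.
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)


def null_expected_edges(k_i, k_j, m):
    """Expected number of edges between two nodes under the NG null model."""
    return k_i * k_j / (2 * m)


# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = ng_modularity(edges, labels)

# Nodes 0 and 5 both have degree 2; the null model still expects a nonzero
# edge weight between them, however far apart they might be in space.
p_far = null_expected_edges(2, 2, len(edges))
```

In this toy graph the null-model term for nodes 0 and 5 is positive even though no spatial information enters the calculation, which is the limitation the abstract describes for spatially embedded networks.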

Publisher / Repository:
Nature Publishing Group
Journal Name:
Scientific Reports
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Dynamic community detection provides a coherent description of network clusters over time, allowing one to track the growth and death of communities as the network evolves. However, modularity maximization, a popular method for performing multilayer community detection, requires the specification of an appropriate null network as well as resolution and interlayer coupling parameters. Importantly, the ability of the algorithm to accurately detect community evolution is dependent on the choice of these parameters. In functional temporal networks, where evolving communities reflect changing functional relationships between network nodes, it is especially important that the detected communities reflect any state changes of the system. Here, we present analytical work suggesting that a uniform null network provides improved sensitivity to the detection of small evolving communities in temporal networks with positive edge weights bounded above by 1, such as certain types of correlation networks. We then propose a method for increasing the sensitivity of modularity maximization to state changes in nodal dynamics by modelling self-identity links between layers based on the self-similarity of the network nodes between layers. This method is more appropriate for functional temporal networks from both a modelling and mathematical perspective, as it incorporates the dynamic nature of network nodes. We motivate our method based on applications in neuroscience where network nodes represent neurons and functional edges represent similarity of firing patterns in time. We show that in simulated data sets of neuronal spike trains, updating interlayer links based on the firing properties of the neurons provides superior community detection of evolving network structure when groups of neurons change their firing properties over time. 
    Finally, we apply our method to experimental calcium imaging data that monitors the spiking activity of hundreds of neurons to track the evolution of neuronal communities during a state change from the awake to anaesthetized state.

  2. Quantifying the differences between networks is a challenging and ever-present problem in network science. In recent years, a multitude of diverse, ad hoc solutions to this problem have been introduced. Here, we propose that simple and well-understood ensembles of random networks—such as Erdős–Rényi graphs, random geometric graphs, Watts–Strogatz graphs, the configuration model and preferential attachment networks—are natural benchmarks for network comparison methods. Moreover, we show that the expected distance between two networks independently sampled from a generative model is a useful property that encapsulates many key features of that model. To illustrate our results, we calculate this within-ensemble graph distance and related quantities for classic network models (and several parameterizations thereof) using 20 distance measures commonly used to compare graphs. The within-ensemble graph distance provides a new framework for developers of graph distances to better understand their creations and for practitioners to better choose an appropriate tool for their particular task. 
  3. Abstract

    Land‐use change is a significant cause of anthropogenic extinctions, which are likely to continue and accelerate as habitat conversion proceeds in most biomes. One way to understand the effects of habitat loss on biodiversity is through improved tools for predicting the number and identity of species losses in response to habitat loss. There are relatively few methods for predicting extinctions and even fewer opportunities for rigorously assessing the quality of these predictions. In this paper, we address these issues by applying a new method based on rarefaction to predict species losses after random, but aggregated, habitat loss. We compare predictions from three rarefaction models, individual‐based, sample‐based, and spatially clustered, to those derived from a commonly used extinction estimation method, the species–area relationship (SAR). We apply each method to a mesocosm experiment, in which we aim to predict species richness and extinctions of arthropods immediately following 50% habitat loss. While each model produced strikingly accurate predictions of species richness immediately after the habitat loss disturbance, each model significantly underestimated the number of extinctions occurring at both the local (within‐mesocosm) and regional (treatment‐wide) scales. Despite the stochastic nature of our small‐scale, short‐term, and randomly applied habitat loss experiment, we found surprisingly clear evidence for extinction selectivity, for example, when abundant species with low extinction probabilities were extirpated following habitat loss. The important role played by selective extinction even in this contrived experimental system suggests that ecologically driven, trait‐based extinctions play an equally important role to stochastic extinction, even when the disturbance itself has no clear selectivity. As a result, neutrally stochastic null models such as the SAR and rarefaction are likely to underestimate extinctions caused by habitat loss. 
    Nevertheless, given the difficulty of predicting extinctions, null models provide useful benchmarks for conservation planning by providing minimum estimates and probabilities of species extinctions.

  4. Model‐based clustering of time‐evolving networks has emerged as an important research topic in statistical network analysis. Modelling time‐varying network parameters is a fundamental research question. However, because functional network parameters are difficult to model, the current literature has made little progress in modelling time‐varying network parameters effectively. In this work, we model network parameters as univariate nonparametric functions instead of constants. We estimate those functional network parameters in temporal exponential‐family random graph models using a kernel regression technique and a local likelihood approach. Furthermore, we propose a semiparametric finite mixture of temporal exponential‐family random graph models by adopting finite mixture models, which allows simultaneous modelling and detection of groups in time‐evolving networks. We also use a conditional likelihood to construct an effective model selection criterion and network cross‐validation to choose an optimal bandwidth. The power of our method is demonstrated in simulation studies and real‐world applications to dynamic international trade networks and dynamic arms trade networks.

  5. Abstract

    Claiming causal inferences in network settings necessitates careful consideration of the often complex dependency between outcomes for actors. Of particular importance are treatment spillover or outcome interference effects. We consider causal inference when the actors are connected via an underlying network structure. Our key contribution is a model for causality when the underlying network is endogenous, that is, when the ties between actors and the actor covariates are statistically dependent. We develop a joint model for the relational and covariate generating process that avoids restrictive separability and fixed network assumptions, as these rarely hold in realistic social settings. While our framework can be used with general models, we develop the highly expressive class of Exponential-family Random Network models (ERNM), of which Markov random fields and Exponential-family Random Graph models are special cases. We present potential outcome-based inference within a Bayesian framework and propose a modification to the exchange algorithm to allow for sampling from ERNM posteriors. We present results of a simulation study demonstrating the validity of the approach. Finally, we demonstrate the value of the framework in a case study of smoking in the context of adolescent friendship networks.
