Title: Preferential attachment with reciprocity: properties and estimation
Abstract

Reciprocity in social networks is a measure of information exchange between two individuals, and indicates interaction patterns between pairs of users. A recent study finds that the reciprocity coefficient of a classical directed preferential attachment (PA) model does not match empirical evidence. Towards remedying this deficiency, we extend the classical three-scenario directed PA model by adding a parameter that controls the probability of creating a reciprocal edge. This proposed model also allows edge creation between two existing nodes, making it a realistic candidate for fitting to datasets. We provide and compare two estimation procedures for fitting the new reciprocity model and demonstrate the methods on simulated and real datasets. One estimation method requires careful analysis of the heavy tail properties of the model. The fitted models provide a good match with the empirical tail distributions of both in- and out-degrees but other mismatched diagnostics suggest that further generalization of the model is warranted.
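The mechanism described in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' specification: it assumes the three scenarios are chosen with probabilities alpha, beta, gamma (new node to existing node, existing to existing, existing node to new node), that attachment weights are degree plus an offset (delta_in, delta_out), and that each new non-loop edge is reciprocated independently with probability rho. All names are hypothetical, and the reciprocity measure used here (share of distinct non-loop edges whose reverse also exists) is one common convention that may differ from the paper's coefficient.

```python
import random


def simulate_pa_reciprocity(n_steps, alpha, beta, gamma,
                            delta_in, delta_out, rho, seed=0):
    """Grow a directed multigraph; return its edge list [(source, target), ...].

    Assumed dynamics (illustrative only, not taken from the paper):
      alpha: new node v sends an edge to an existing node w,
             with w chosen with probability proportional to in-degree + delta_in
      beta:  edge between two existing nodes, source chosen proportional to
             out-degree + delta_out, target proportional to in-degree + delta_in
      gamma: existing node w (proportional to out-degree + delta_out)
             sends an edge to a new node v
      rho:   probability that a new non-loop edge is immediately reciprocated
    """
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    if delta_in <= 0 or delta_out <= 0:
        raise ValueError("offsets must be positive in this sketch")

    rng = random.Random(seed)
    in_deg, out_deg = [1], [1]   # start from one node with a self-loop
    edges = [(0, 0)]

    def pick(degrees, delta):
        # Sample an existing node with weight degree + delta.
        weights = [d + delta for d in degrees]
        return rng.choices(range(len(degrees)), weights=weights)[0]

    def new_node():
        in_deg.append(0)
        out_deg.append(0)
        return len(in_deg) - 1

    def add_edge(u, v):
        edges.append((u, v))
        out_deg[u] += 1
        in_deg[v] += 1

    for _ in range(n_steps):
        r = rng.random()
        if r < alpha:
            target = pick(in_deg, delta_in)    # choose before the new node exists
            source = new_node()
        elif r < alpha + beta:
            source = pick(out_deg, delta_out)
            target = pick(in_deg, delta_in)
        else:
            source = pick(out_deg, delta_out)  # choose before the new node exists
            target = new_node()
        add_edge(source, target)
        if source != target and rng.random() < rho:
            add_edge(target, source)           # reciprocal edge
    return edges


def reciprocity(edges):
    """Share of distinct non-loop directed edges whose reverse is also present."""
    distinct = {(u, v) for u, v in edges if u != v}
    if not distinct:
        return 0.0
    return sum((v, u) in distinct for u, v in distinct) / len(distinct)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.8):
        e = simulate_pa_reciprocity(5000, 0.3, 0.4, 0.3, 1.0, 1.0, rho, seed=1)
        print(f"rho={rho:.1f}  edges={len(e)}  reciprocity={reciprocity(e):.3f}")
```

Running the script prints the measured reciprocity for a few values of rho. Under these assumptions, rho = 0 leaves only chance reciprocation, while larger rho drives the measure upward, which is the lever the abstract describes for matching empirical reciprocity.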

 
NSF-PAR ID:
10456548
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of Complex Networks
Volume:
11
Issue:
5
ISSN:
2051-1329
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Graph matching algorithms attempt to find the best correspondence between the nodes of two networks. These techniques have been used to match individual neurons in nanoscale connectomes—in particular, to find pairings of neurons across hemispheres. However, since graph matching techniques deal with two isolated networks, they have only utilized the ipsilateral (same hemisphere) subgraphs when performing the matching. Here, we present a modification to a state-of-the-art graph matching algorithm that allows it to solve what we call the bisected graph matching problem. This modification allows us to leverage the connections between the brain hemispheres when predicting neuron pairs. Via simulations and experiments on real connectome datasets, we show that this approach improves matching accuracy when sufficient edge correlation is present between the contralateral (between hemisphere) subgraphs. We also show how matching accuracy can be further improved by combining our approach with previously proposed extensions to graph matching, which utilize edge types and previously known neuron pairings. We expect that our proposed method will improve future endeavors to accurately match neurons across hemispheres in connectomes, and be useful in other applications where the bisected graph matching problem arises.

     
  2. Abstract

    Graph matching algorithms attempt to find the best correspondence between the nodes of two networks. These techniques have been used to match individual neurons in nanoscale connectomes – in particular, to find pairings of neurons across hemispheres. However, since graph matching techniques deal with two isolated networks, they have only utilized the ipsilateral (same hemisphere) subgraphs when performing the matching. Here, we present a modification to a state-of-the-art graph matching algorithm which allows it to solve what we call the bisected graph matching problem. This modification allows us to leverage the connections between the brain hemispheres when predicting neuron pairs. Via simulations and experiments on real connectome datasets, we show that this approach improves matching accuracy when sufficient edge correlation is present between the contralateral (between hemisphere) subgraphs. We also show how matching accuracy can be further improved by combining our approach with previously proposed extensions to graph matching, which utilize edge types and previously known neuron pairings. We expect that our proposed method will improve future endeavors to accurately match neurons across hemispheres in connectomes, and be useful in other applications where the bisected graph matching problem arises.

     
  3. Abstract

    Preferential attachment, homophily, and their consequences such as scale-free (i.e. power-law) degree distributions, the glass ceiling effect (the unseen, yet unbreakable barrier that keeps minorities and women from rising to the upper rungs of the corporate ladder, regardless of their qualifications or achievements) and perception bias are well-studied in undirected networks. However, such consequences and the factors that lead to their emergence in directed networks (e.g. author–citation graphs, Twitter) are yet to be coherently explained in an intuitive, theoretically tractable manner using a single dynamical model. To this end, we present a theoretical and numerical analysis of the novel Directed Mixed Preferential Attachment (DMPA) model in order to explain the emergence of scale-free degree distributions and the glass ceiling effect in directed networks with two groups (minority and majority). Specifically, we first derive closed-form expressions for the power-law exponents of the in-degree and out-degree distributions of each of the two groups and then compare the derived exponents with each other to obtain useful insights. These insights include answers to questions such as: When does the minority group have an out-degree (or in-degree) distribution with a heavier tail compared to the majority group? What factors cause the tail of the out-degree distribution of a group to be heavier than the tail of its own in-degree distribution? What effect does frequent addition of edges between existing nodes have on the in-degree and out-degree distributions of the majority and minority groups? Answers to these questions shed light on the interplay between structure (i.e. the in-degree and out-degree distributions of the two groups) and dynamics (characterized collectively by the homophily, preferential attachment, group sizes and growth dynamics) of various real-world directed networks. We also provide a novel definition of the glass ceiling faced by a group via the number of individuals with large out-degree (i.e. those with many followers) normalized by the number of individuals with large in-degree (i.e. those who follow many others) and then use it to characterize the conditions that cause the glass ceiling effect to emerge in a directed network. Our analytical results are supported by detailed numerical experiments. The DMPA model and its theoretical and numerical analysis provided in this article are useful for analysing various phenomena on directed networks in fields such as network science and computational social science.

     
  4. In this paper, we consider two fundamental cut approximation problems on large graphs. We prove new lower bounds for both problems that are optimal up to logarithmic factors. The first problem is approximating cuts in balanced directed graphs, where the goal is to build a data structure to provide a $(1 \pm \epsilon)$-estimation of the cut values of a graph on $n$ vertices. For this problem, there are tight bounds for undirected graphs, but for directed graphs, such a data structure requires $\Omega(n^2)$ bits even for constant $\epsilon$. To cope with this, recent works consider $\beta$-balanced graphs, meaning that for every directed cut, the total weight of edges in one direction is at most $\beta$ times the total weight in the other direction. We consider the for-each model, where the goal is to approximate a fixed cut with high probability, and the for-all model, where the data structure must simultaneously preserve all cuts. We improve the previous $\Omega(n \sqrt{\beta/\epsilon})$ lower bound in the for-each model to $\tilde\Omega(n \sqrt{\beta}/\epsilon)$ and we improve the previous $\Omega(n \beta/\epsilon)$ lower bound in the for-all model to $\Omega(n \beta/\epsilon^2)$. This resolves the main open questions of (Cen et al., ICALP, 2021). The second problem is approximating the global minimum cut in the local query model where we can only access the graph through degree, edge, and adjacency queries. We prove an $\Omega(\min\{m, \frac{m}{\epsilon^2 k}\})$ lower bound for this problem, which improves the previous $\Omega(\frac{m}{k})$ lower bound, where $m$ is the number of edges of the graph, $k$ is the minimum cut size, and we seek a $(1+\epsilon)$-approximation. In addition, we observe that existing upper bounds with minor modifications match our lower bound up to logarithmic factors. 
  5. Abstract

    Motivated by the widespread use of large gridded data sets in the atmospheric sciences, we propose a new model for extremes of areal data that is inspired by the simultaneous autoregressive (SAR) model in classical spatial statistics. Our extreme SAR model extends recent work on transformed‐linear operations applied to regularly varying random vectors, and is unique among extremes models in being directly analogous to a classical linear model. An additional appeal is its simplicity; given a proximity matrix W, spatial dependence is described by a single parameter. We develop an estimation method that minimizes the discrepancy between the tail pairwise dependence matrix (TPDM) for the fitted model and the estimated TPDM. Applying this method to simulated data demonstrates that it is able to produce good estimates of extremal spatial dependence even in the case of model misspecification, and additionally produces reasonable estimates of uncertainty. We also apply the method to gridded precipitation observations for a study region over northeast Colorado, and find that a single‐parameter extreme SAR model paired with a neighborhood structure which accounts for longer range dependence effectively models spatial dependence in these data.

     