
Title: Scale-free degree distributions, homophily and the glass ceiling effect in directed networks

Preferential attachment, homophily, and their consequences such as scale-free (i.e. power-law) degree distributions, the glass ceiling effect (the unseen, yet unbreakable barrier that keeps minorities and women from rising to the upper rungs of the corporate ladder, regardless of their qualifications or achievements) and perception bias are well-studied in undirected networks. However, such consequences and the factors that lead to their emergence in directed networks (e.g. author–citation graphs, Twitter) are yet to be coherently explained in an intuitive, theoretically tractable manner using a single dynamical model. To this end, we present a theoretical and numerical analysis of the novel Directed Mixed Preferential Attachment (DMPA) model in order to explain the emergence of scale-free degree distributions and the glass ceiling effect in directed networks with two groups (minority and majority). Specifically, we first derive closed-form expressions for the power-law exponents of the in-degree and out-degree distributions of each of the two groups and then compare the derived exponents with each other to obtain useful insights. These insights include answers to questions such as: When does the minority group have an out-degree (or in-degree) distribution with a heavier tail compared to the majority group? What factors cause the tail of the out-degree distribution of a group to be heavier than the tail of its own in-degree distribution? What effect does frequent addition of edges between existing nodes have on the in-degree and out-degree distributions of the majority and minority groups? Answers to these questions shed light on the interplay between structure (i.e. the in-degree and out-degree distributions of the two groups) and dynamics (characterized collectively by the homophily, preferential attachment, group sizes and growth dynamics) of various real-world directed networks.
We also provide a novel definition of the glass ceiling faced by a group via the number of individuals with large in-degree (i.e. those with many followers) normalized by the number of individuals with large out-degree (i.e. those who follow many others) and then use it to characterize the conditions that cause the glass ceiling effect to emerge in a directed network. Our analytical results are supported by detailed numerical experiments. The DMPA model and its theoretical and numerical analysis provided in this article are useful for analysing various phenomena on directed networks in fields such as network science and computational social science.
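The ratio behind this glass ceiling definition can be illustrated with a minimal sketch. It is not the article's formal definition: the function name `glass_ceiling_ratio`, the fixed threshold `k`, the "at least `k`" comparison and the toy numbers are all assumptions made for illustration. The sketch counts the members of one group who have many followers and divides by the count of members who follow many others.

```python
from typing import Sequence


def glass_ceiling_ratio(
    followers: Sequence[int],
    following: Sequence[int],
    k: int,
) -> float:
    """Members with at least k followers, divided by members who follow at least k others.

    Hypothetical helper: the threshold k and the >= comparison are
    illustrative assumptions, not the article's exact definition.
    """
    if len(followers) != len(following):
        raise ValueError("expected one follower count and one following count per individual")
    many_followers = sum(1 for count in followers if count >= k)
    follow_many = sum(1 for count in following if count >= k)
    if follow_many == 0:
        # The ratio is undefined when nobody in the group follows k or more others.
        return float("nan")
    return many_followers / follow_many


# Toy data, made up for illustration: per-individual counts for one group.
minority_followers = [0, 1, 2, 12, 3]
minority_following = [10, 15, 2, 11, 20]
print(glass_ceiling_ratio(minority_followers, minority_following, k=10))  # prints 0.25
```

In this toy group, four members follow at least ten others but only one has at least ten followers, so the ratio is small. Comparing such a ratio across the minority and majority groups, and across increasing thresholds, is one way to probe the kind of asymmetry the definition is meant to capture.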

Publisher / Repository:
Oxford University Press
Journal Name:
Journal of Complex Networks
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    We investigate the three-state majority-vote model for opinion dynamics on scale-free and regular networks. In this model, an individual selects an opinion equal to the opinion of the majority of its neighbors with probability 1 − q, and different to it with probability q. The parameter q is called the noise parameter of the model. We build a network of interactions where z neighbors are selected by each added site in the system, a preferential attachment network with degree distribution $$k^{-\lambda}$$, where λ = 3 for a large number of nodes N. In this work, z is called the growth parameter. Using finite-size scaling analysis, we obtain the critical exponents $$\beta/\bar{\nu}$$ and $$\gamma/\bar{\nu}$$ associated with the magnetization and the susceptibility, respectively. Using Monte Carlo simulations, we calculate the critical noise parameter $$q_c$$ as a function of z for the scale-free networks and obtain the phase diagram of the model. We find that the critical exponents add up to unity when using a special volumetric scaling, regardless of the dimension of the network of interactions. We verify this result by obtaining the critical noise and the critical exponents for the two- and three-state majority-vote models on cubic lattice networks.

  2. Graph neural networks (GNNs) have emerged as a powerful tool for modeling graph data due to their ability to learn a concise representation of the data by integrating the node attributes and link information in a principled fashion. However, despite their promise, there are several practical challenges that must be overcome to effectively use them for node classification problems. In particular, current approaches are vulnerable to different kinds of biases inherent in the graph data. First, if the class distribution is imbalanced, then the GNNs' loss function is biased towards classifying the majority class correctly rather than the minority class, which hurts the performance of the latter class. Second, due to the homophily effect, the learned representation and subsequent downstream tasks may favor certain demographic groups over others when applied to social network data. To mitigate such biases, we propose a novel framework called Fairness-Aware Cost Sensitive Graph Convolutional Network (FACS-GCN) for classifying nodes in networks with skewed class distributions. Our approach combines a cost-sensitive exponential loss with an adversarial learning component to alleviate the ill-effects of both biases. The framework employs a stagewise additive modeling approach to ensure there is no significant loss in accuracy when imparting fairness into the GNN. Experimental results on six benchmark graph datasets demonstrate the effectiveness of FACS-GCN against comparable baseline methods in terms of promoting fairness while maintaining a high model accuracy on the majority of the datasets.
  3. Abstract

    Reciprocity in social networks is a measure of information exchange between two individuals, and indicates interaction patterns between pairs of users. A recent study finds that the reciprocity coefficient of a classical directed preferential attachment (PA) model does not match empirical evidence. Towards remedying this deficiency, we extend the classical three-scenario directed PA model by adding a parameter that controls the probability of creating a reciprocal edge. This proposed model also allows edge creation between two existing nodes, making it a realistic candidate for fitting to datasets. We provide and compare two estimation procedures for fitting the new reciprocity model and demonstrate the methods on simulated and real datasets. One estimation method requires careful analysis of the heavy tail properties of the model. The fitted models provide a good match with the empirical tail distributions of both in- and out-degrees but other mismatched diagnostics suggest that further generalization of the model is warranted.

  4. Estrada, Ernesto (Ed.)
    Abstract Preferential attachment (PA) models are a common class of graph models which have been used to explain why power-law distributions appear in the degree sequences of real network data. Among other properties of real-world networks, they commonly have non-trivial clustering coefficients due to an abundance of triangles as well as power laws in the eigenvalue spectra. Although there are triangle PA models and eigenvalue power laws in specific PA constructions, there are no results showing that existing constructions have both. In this article, we present a specific Triangle Generalized Preferential Attachment Model that, by construction, has non-trivial clustering. We further prove that this model has a power law in both the degree distribution and eigenvalue spectra.
  5. Abstract We investigate the statistical learning of nodal attribute functionals in homophily networks using random walks. Attributes can be discrete or continuous. A generalization of various existing canonical models based on preferential attachment is studied (model class $$\mathscr{P}$$), where new nodes form connections dependent on both their attribute values and popularity as measured by degree. An associated model class $$\mathscr{U}$$ is described, which is amenable to theoretical analysis and gives access to asymptotics of a host of functionals of interest. Settings where asymptotics for model class $$\mathscr{U}$$ transfer over to model class $$\mathscr{P}$$ through the phenomenon of resolvability are analyzed. For the statistical learning, we consider several canonical attribute-agnostic sampling schemes such as the Metropolis-Hastings random walk and versions of node2vec (Grover and Leskovec, 2016) that incorporate both classical random walk and non-backtracking propensities, and propose new variants which use attribute information in addition to topological information to explore the network. Estimators for learning the attribute distribution, the degree distribution for an attribute type and homophily measures are proposed. The performance of this statistical learning framework is studied on both synthetic networks (model class $$\mathscr{P}$$) and real-world systems, and its dependence on the network topology, the degree of homophily or absence thereof, and (un)balanced attributes is assessed.