Title: A Two-Stage Working Model Strategy for Network Analysis Under Hierarchical Exponential Random Graph Models
Social network data are complex and dependent. At the macro-level, social networks often exhibit clustering, in the sense that they consist of communities; at the micro-level, they often exhibit complex features such as transitivity within communities. Modeling real-world social networks requires modeling both levels, but many existing models focus on one while neglecting the other. In recent work, [28] introduced a class of Exponential Random Graph Models (ERGMs) that captures community structure as well as micro-level features within communities. While attractive, existing approaches to estimating ERGMs with community structure are not scalable. We propose a scalable two-stage strategy to estimate an important class of ERGMs with community structure, which induces transitivity within communities. At the first stage, we use an approximate model, called the working model, to estimate the community structure. At the second stage, we use ERGMs with geometrically weighted dyadwise and edgewise shared partner terms to capture refined forms of transitivity within communities. We use simulations to demonstrate the performance of the two-stage strategy in terms of the estimated community structure. In addition, we show that the estimated ERGMs with geometrically weighted dyadwise and edgewise shared partner terms within communities outperform the working model in terms of goodness-of-fit. Finally, we present an application to high-resolution human contact network data.
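To make the second-stage term concrete, the following is a minimal sketch of the geometrically weighted edgewise shared partner (GWESP) statistic, evaluated separately within each community. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `labels` argument (community memberships, assumed to come from the first stage), and the choice to count shared partners only inside each community are all hypothetical. The statistic uses the standard form e^θ Σ_k [1 − (1 − e^{−θ})^k] EP_k, where EP_k is the number of edges whose endpoints have exactly k shared partners and θ is the decay parameter.

```python
import math

def shared_partner_counts(n, edges):
    """Map each undirected edge (i, j), i < j, to its number of shared partners."""
    nbrs = [set() for _ in range(n)]
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    return {(min(i, j), max(i, j)): len(nbrs[i] & nbrs[j]) for i, j in edges}

def gwesp(n, edges, decay):
    """GWESP: e^decay * sum over edges of [1 - (1 - e^-decay)^k], k = shared partners."""
    ratio = 1.0 - math.exp(-decay)
    counts = shared_partner_counts(n, edges)
    return math.exp(decay) * sum(1.0 - ratio ** k for k in counts.values())

def within_community_gwesp(n, edges, labels, decay):
    """Evaluate GWESP on each community's induced subgraph.

    `labels[i]` is the (assumed, first-stage) community of node i; edges
    between communities are ignored here.
    """
    totals = {}
    for block in set(labels):
        block_edges = [(i, j) for i, j in edges
                       if labels[i] == block and labels[j] == block]
        totals[block] = gwesp(n, block_edges, decay)
    return totals

# Triangle 0-1-2 in community 0, plus a pendant edge to node 3 in community 1.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
stats = within_community_gwesp(4, edges, labels=[0, 0, 0, 1], decay=0.5)
```

An edge with exactly one shared partner contributes 1 regardless of the decay value, so the triangle contributes 3; the decay parameter only matters once edges have two or more shared partners, where each additional partner adds a geometrically shrinking increment.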
Award ID(s):
1513644
PAR ID:
10065135
Journal Name:
2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
Page Range / eLocation ID:
290 to 298
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Multilevel network data provide two important benefits for ERG modeling. First, they facilitate estimation of the decay parameters in geometrically weighted terms for degree and triad distributions. Estimating decay parameters from a single network is challenging, so in practice they are typically fixed rather than estimated. Multilevel network data overcome that challenge by leveraging replication. Second, such data make it possible to assess out-of-sample performance using traditional cross-validation techniques. We demonstrate these benefits by using a multilevel network sample of classroom networks from Poland. We show that estimating the decay parameters improves in-sample performance of the model and that the out-of-sample performance of our best model is strong, suggesting that our findings can be generalized to the population of interest.
  2. Active learning is broadly shown to improve student outcomes as compared with traditional lecture, but more work must be done to distinguish outcomes between different types of active learning. We collected self-reported student social network data at early and late-semester times in a Peer Instruction classroom. The subsequent networks are modeled using exponential random graph models (ERGMs), which are a family of statistical models used with relational data, like social networks. We discuss preliminary findings using this method for a Peer Instruction class. The best-fit ERGM predicts long "chains" of student edges, such as might arise from students talking along rows in the lecture hall. ERGMs appear to be a promising method for quantifying network topology in active learning classrooms. 
  3. Many social networks contain sensitive relational information. One approach to protecting this information while offering flexibility for social network research and analysis is to release synthetic social networks at a pre-specified privacy risk level, given the original observed network. We propose the DP-ERGM procedure, which synthesizes networks that satisfy differential privacy (DP) via the exponential random graph model (ERGM). We apply DP-ERGM to a college student friendship network and compare how well the generated private networks preserve the original network information against two other approaches: differentially private DyadWise Randomized Response (DWRR) and Sanitization of the Conditional probability of Edge given Attribute classes (SCEA). The results suggest that DP-ERGM preserves the original information significantly better than DWRR and SCEA in both network statistics and inferences from ERGMs and latent space models. In addition, DP-ERGM satisfies node DP, a stronger notion of privacy than the edge DP that DWRR and SCEA satisfy.
  4. Community structure is a fundamental topological characteristic of optimally organized brain networks. Currently, there is no clear standard or systematic approach for selecting the most appropriate community detection method. Furthermore, the impact of method choice on the accuracy and robustness of estimated communities (and network modularity), as well as method‐dependent relationships between network communities and cognitive and other individual measures, are not well understood. This study analyzed large datasets of real brain networks (estimated from resting‐state fMRI from N = 5251 pre/early adolescents in the adolescent brain cognitive development [ABCD] study) and N = 5338 synthetic networks with heterogeneous, data‐inspired topologies, with the goal of investigating and comparing three classes of community detection methods: (i) modularity maximization‐based (Newman and Louvain), (ii) probabilistic (Bayesian inference within the framework of stochastic block modeling (SBM)), and (iii) geometric (based on graph Ricci flow). Extensive comparisons between methods and their individual accuracy (relative to the ground truth in synthetic networks) and reliability (when applied to multiple fMRI runs from the same brains) suggest that the underlying brain network topology plays a critical role in the accuracy, reliability and agreement of community detection methods. Consistent method (dis)similarities, and their correlations with topological properties, were estimated across fMRI runs. Based on synthetic graphs, most methods performed similarly and had comparably high accuracy only in some topological regimes, specifically those corresponding to developed connectomes with at least quasi‐optimal community organization. In contrast, in densely and/or weakly connected networks with difficult‐to‐detect communities, the methods yielded highly dissimilar results, with Bayesian inference within SBM having significantly higher accuracy than all others. Associations between method‐specific modularity and demographic, anthropometric, physiological and cognitive parameters showed mostly method invariance but some method dependence as well. Although method sensitivity to different levels of community structure may in part explain method‐dependent associations between modularity estimates and parameters of interest, method dependence also highlights potential issues of reliability and reproducibility. These findings suggest that a probabilistic approach, such as Bayesian inference in the framework of SBM, may provide consistently reliable estimates of community structure across network topologies. In addition, to maximize robustness of biological inferences, identified network communities and their cognitive, behavioral and other correlates should be confirmed with multiple reliable detection methods.
  5. Decision-making on networks can be explained by both homophily and social influences. While homophily drives the formation of communities with similar characteristics, social influences occur both within and between communities. Social influences can be reasoned through role theory, which indicates that the influences among individuals depend on their roles and the behavior of interest. To operationalize these social science theories, we empirically identify the homophilous communities and use the community structures to capture such “roles”, which affect particular decision-making processes. We propose a generative model named the Stochastic Block influences Model and jointly analyze both network formation and behavioral influences within and between different empirically-identified communities. To evaluate the performance and demonstrate the interpretability of our method, we study the adoption decisions for a microfinance product in Indian villages. We show that although individuals tend to form links within communities, there are strongly positive and negative social influences between communities, supporting the weak ties theory. Moreover, communities with shared characteristics are associated with positive influences. In contrast, communities that do not overlap are associated with negative influences. Our framework facilitates the quantification of the influences underlying decision communities and is thus a helpful tool for driving information diffusion, viral marketing, and technology adoption.