skip to main content

Title: Comparing alternatives to the fixed degree sequence model for extracting the backbone of bipartite projections
Abstract Projections of bipartite or two-mode networks capture co-occurrences, and are used in diverse fields (e.g., ecology, economics, bibliometrics, politics) to represent unipartite networks. A key challenge in analyzing such networks is determining whether an observed number of co-occurrences between two nodes is significant, and therefore whether an edge exists between them. One approach, the fixed degree sequence model (FDSM), evaluates the significance of an edge’s weight by comparison to a null model in which the degree sequences of the original bipartite network are fixed. Although the FDSM is an intuitive null model, it is computationally expensive because it requires Monte Carlo simulation to estimate each edge’s p value, and therefore is impractical for large projections. In this paper, we explore four potential alternatives to FDSM: fixed fill model, fixed row model, fixed column model, and stochastic degree sequence model (SDSM). We compare these models to FDSM in terms of accuracy, speed, statistical power, similarity, and ability to recover known communities. We find that the computationally-fast SDSM offers a statistically conservative but close approximation of the computationally-impractical FDSM under a wide range of conditions, and that it correctly recovers a known community structure even when the signal is weak. Therefore, although each backbone model may have particular applications, we recommend SDSM for extracting the backbone of bipartite projections when FDSM is impractical.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Scientific Reports
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Many applications require randomly sampling bipartite graphs with fixed degrees or randomly sampling incidence matrices with fixed row and column sums. Although several sampling algorithms exist, the ‘curveball’ algorithm is the most efficient with an asymptotic time complexity of $O(n~log~n)$ and has been proven to sample uniformly at random. In this article, we introduce the ‘fastball’ algorithm, which adopts a similar approach but has an asymptotic time complexity of $O(n)$. We show that a C$\texttt{++}$ implementation of fastball randomly samples large bipartite graphs with fixed degrees faster than curveball, and illustrate the value of this faster algorithm in the context of the fixed degree sequence model for backbone extraction.

    more » « less
  2. This study examined how humans spontaneously merge a sequence of discrete actions into a rhythmic pattern, even when periodicity is not required. Two experiments used a virtual throwing task, in which subjects performed a long sequence of discrete throwing movements, aiming to hit a virtual target. In experiment 1, subjects performed the task for 11 sessions. Although there was no instruction to perform rhythmically, the variability of the interthrow intervals decreased to a level comparable to that of synchronizing with a metronome; furthermore, dwell times shortened or even disappeared with practice. Floquet multipliers and decreasing variability of the arm trajectories estimated in state space indicated an increasing degree of dynamic stability. Subjects who achieved a higher level of periodicity and stability also displayed higher accuracy in the throwing task. To directly test whether rhythmicity affected performance, experiment 2 disrupted the evolving continuity and periodicity by enforcing a pause between successive throws. This discrete group performed significantly worse and with higher variability in their arm trajectories than the self-paced group. These findings are discussed in the context of previous neuroimaging results showing that rhythmic movements involve significantly fewer cortical and subcortical activations than discrete movements and therefore may pose a computationally more parsimonious solution. Such emerging stable rhythms in neuromotor subsystems may serve as building blocks or dynamic primitives for complex actions. The tendency for humans to spontaneously fall into a rhythm in voluntary movements is consistent with the ubiquity of rhythms at all levels of the physiological system. NEW & NOTEWORTHY When performing a series of throws to hit a target, humans spontaneously merged successive actions into a continuous approximately periodic pattern. The degree of rhythmicity and stability correlated with hitting accuracy. Enforcing irregular pauses between throws to disrupt the rhythm deteriorated performance. Stable rhythmic patterns may simplify control of movement and serve as dynamic primitives for more complex actions. This observation reveals that biological systems tend to exhibit rhythmic behavior consistent with a plethora of physiological processes. 
    more » « less
  3. Cherifi, Hocine (Ed.)
    Networks are useful for representing phenomena in a broad range of domains. Although their ability to represent complexity can be a virtue, it is sometimes useful to focus on a simplified network that contains only the most important edges: the backbone. This paper introduces and demonstrates a substantially expanded version of the backbone package for R, which now provides methods for extracting backbones from weighted networks, weighted bipartite projections, and unweighted networks. For each type of network, fully replicable code is presented first for small toy examples, then for complete empirical examples using transportation, political, and social networks. The paper also demonstrates the implications of several issues of statistical inference that arise in backbone extraction. It concludes by briefly reviewing existing applications of backbone extraction using the backbone package, and future directions for research on network backbone extraction. 
    more » « less
  4. Online bipartite matching and allocation models are widely used to analyze and design markets such as Internet advertising, online labor, and crowdsourcing. Traditionally, vertices on one side of the market are fixed and known a priori, while vertices on the other side arrive online and are matched by a central agent to the offline side. The issue of possible conflicts among offline agents emerges in various real scenarios when we need to match each online agent with a set of offline agents. For example, in event-based social networks (e.g., Meetup), offline events conflict for some users since they will be unable to attend mutually-distant events at proximate times; in advertising markets, two competing firms may prefer not to be shown to one user simultaneously; and in online recommendation systems (e.g., Amazon Books), books of the same type “conflict” with each other in some sense due to the diversity requirement for each online buyer. The conflict nature inherent among certain offline agents raises significant challenges in both modeling and online algorithm design. In this paper, we propose a unifying model, generalizing the conflict models proposed in (She et al., TKDE 2016) and (Chen et al., TKDE 16). Our model can capture not only a broad class of conflict constraints on the offline side (which is even allowed to be sensitive to each online agent), but also allows a general arrival pattern for the online side (which is allowed to change over the online phase). We propose an efficient linear programming (LP) based online algorithm and prove theoretically that it has nearly-optimal online performance. Additionally, we propose two LP-based heuristics and test them against two natural baselines on both real and synthetic datasets. Our LP-based heuristics experimentally dominate the baseline algorithms, aligning with our theoretical predictions and supporting our unified approach. 
    more » « less
  5. Abstract

    Specialized associations between interacting species fundamentally determine the diversity and distribution of both partners. How the specialization of guilds of organisms varies along environmental gradients underpins popular theories of biogeography and macroecology, whereas the degree of specialization of a species is typically considered fixed. However, the extent to which environmental context impacts specialization dynamics is seldom examined empirically. In this study, we examine how specialization within a bipartite network consisting of three co-occurring plant species and their foliar fungal endophyte symbionts changes along a 1000-meter elevation gradient where host species were held constant. The gradient, along the slope of Mauna Loa shield volcano, represents almost the entire elevational range of two of the three plants. Network and plant specialization values displayed a parabolic relationship with elevation, and were highest at middle elevations, whereas bipartite associations were least specific at low and high elevations. Shannon’s diversity of fungal endophytes correlated negatively with specificity, and was highest at the ends of the transects. Although plant host was a strong determinant of fungal community composition within sites, fungal species turnover was high among sites. There was no evidence of spatial or elevational patterning in fungal community compositon. Our work demonstrates that specificity can be a plastic trait, which is influenced by the environment and centrality of the host within its natural range.

    more » « less