Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
The number of non-negative integer matrices with given row and column sums features in a variety of problems in mathematics and statistics but no closed-form expression for it is known, so we rely on approximations. In this paper, we describe a new such approximation, motivated by consideration of the statistics of matrices with non-integer numbers of columns. This estimate can be evaluated in time linear in the size of the matrix and returns results of accuracy as good as or better than existing linear-time approximations across a wide range of settings. We show that the estimate is asymptotically exact in the regime of sparse tables, while empirically performing at least as well as other linear-time estimates in the regime of dense tables. We also use the new estimate as the starting point for an improved numerical method for either counting or sampling matrices with given margins using sequential importance sampling. Code implementing our methods is available.more » « lessFree, publicly-accessible full text available January 1, 2025
-
We study the ranking of individuals, teams, or objects, based on pairwise comparisons between them, using the Bradley-Terry model. Estimates of rankings within this model are commonly made using a simple iterative algorithm first introduced by Zermelo almost a century ago. Here we describe an alternative and similarly simple iteration that provably returns identical results but does so much faster -- over a hundred times faster in some cases. We demonstrate this algorithm with applications to a range of example data sets and derive a number of results regarding its convergence.more » « less
-
We study core-periphery structure in networks using inference methods based on a flexible network model that allows for traditional onion-like cores within cores, but also for hierarchical tree-like structures and more general non-nested types of structure. We propose an efficient Monte Carlo scheme for fitting the model to observed networks and report results for a selection of real-world data sets. Among other things, we observe an empirical distinction between networks showing traditional core-periphery structure with a dense core weakly connected to a sparse periphery, and an alternative structure in which the core is strongly connected both within itself and to the periphery. Networks vary in whether they are better represented by one type of structure or the other. We also observe structures that are a hybrid between core-periphery structure and community structure, in which networks have a set of non-overlapping cores that correspond roughly to communities, surrounded by a single undifferentiated periphery. Computer code implementing our methods is available.more » « less
-
Networks and network computations have become a primary mathematical tool for analyzing the structure of many kinds of complex systems, ranging from the Internet and transportation networks to biochemical interactions and social networks. A common task in network analysis is the calculation of quantities that reside on the nodes of a network, such as centrality measures, probabilities or model states. In this perspective article we discuss message passing methods, a family of techniques for performing such calculations, based on the propagation of information between the nodes of a network. We introduce the message passing approach with a series of examples, give some illustrative applications and results and discuss the deep connections between message passing and phase transitions in networks. We also point out some limitations of the message passing approach and describe some recently introduced methods that address these limitations.more » « less
-
Abstract Methods for detecting community structure in networks typically aim to identify a single best partition of network nodes into communities, often by optimizing some objective function, but in real-world applications there may be many competitive partitions with objective scores close to the global optimum and one can obtain a more informative picture of the community structure by examining a representative set of such high-scoring partitions than by looking at just the single optimum. However, such a set can be difficult to interpret since its size can easily run to hundreds or thousands of partitions. In this paper we present a method for analyzing large partition sets by dividing them into groups of similar partitions and then identifying an archetypal partition as a representative of each group. The resulting set of archetypal partitions provides a succinct, interpretable summary of the form and variety of community structure in any network. We demonstrate the method on a range of example networks.more » « less
-
The task of ranking individuals or teams, based on a set of comparisons between pairs, arises in various contexts, including sporting competitions and the analysis of dominance hierarchies among animals and humans. Given data on which competitors beat which others, the challenge is to rank the competitors from best to worst. Here we study the problem of computing rankings when there are multiple, potentially conflicting types of comparison, such as multiple types of dominance behaviours among animals. We assume that we do not know a priori what information each behaviour conveys about the ranking, or even whether they convey any information at all. Nonetheless, we show that it is possible to compute a ranking in this situation and present a fast method for doing so, based on a combination of an expectation–maximization algorithm and a modified Bradley–Terry model. We give a selection of example applications to both animal and human competition.more » « less
-
The Border Gateway Protocol (BGP) is a distributed protocol that manages interdomain routing without requiring a centralized record of which autonomous systems (ASes) connect to which others. Many methods have been devised to infer the AS topology from publicly available BGP data, but none provide a general way to handle the fact that the data are notoriously incomplete and subject to error. This paper describes a method for reliably inferring AS-level connectivity in the presence of measurement error using Bayesian statistical inference acting on BGP routing tables from multiple vantage points. We employ a novel approach for counting AS adjacency observations in the AS-PATH attribute data from public route collectors, along with a Bayesian algorithm to generate a statistical estimate of the AS-level network. Our approach also gives us a way to evaluate the accuracy of existing reconstruction methods and to identify advantageous locations for new route collectors or vantage points.more » « less
-
null (Ed.)Abstract Empirical measurements of ecological networks such as food webs and mutualistic networks are often rich in structure but also noisy and error-prone, particularly for rare species for which observations are sparse. Focusing on the case of plant–pollinator networks, we here describe a Bayesian statistical technique that allows us to make accurate estimates of network structure and ecological metrics from such noisy observational data. Our method yields not only estimates of these quantities, but also estimates of their statistical errors, paving the way for principled statistical analyses of ecological variables and outcomes. We demonstrate the use of the method with an application to previously published data on plant–pollinator networks in the Seychelles archipelago and Kosciusko National Park, calculating estimates of network structure, network nestedness, and other characteristics.more » « less
-
Estrada, Ernesto (Ed.)Abstract The friendship paradox is the observation that the degrees of the neighbours of a node in any network will, on average, be greater than the degree of the node itself. In common parlance, your friends have more friends than you do. In this article, we develop the mathematical theory of the friendship paradox, both in general as well as for specific model networks, focusing not only on average behaviour but also on variation about the average and using generating function methods to calculate full distributions of quantities of interest. We compare the predictions of our theory with measurements on a large number of real-world network datasets and find remarkably good agreement. We also develop equivalent theory for the generalized friendship paradox, which compares characteristics of nodes other than degree to those of their neighbours.more » « less