Abstract Methods for detecting community structure in networks typically aim to identify a single best partition of network nodes into communities, often by optimizing some objective function. In real-world applications, however, there may be many competitive partitions with objective scores close to the global optimum, and one can obtain a more informative picture of the community structure by examining a representative set of such high-scoring partitions than by looking at just the single optimum. Such a set can be difficult to interpret, though, since its size can easily run to hundreds or thousands of partitions. In this paper we present a method for analyzing large partition sets by dividing them into groups of similar partitions and then identifying an archetypal partition as a representative of each group. The resulting set of archetypal partitions provides a succinct, interpretable summary of the form and variety of community structure in any network. We demonstrate the method on a range of example networks.
Asymptotics of pure dimer coverings on rail yard graphs
Abstract We study the asymptotic limit of random pure dimer coverings on rail yard graphs when the mesh sizes of the graphs go to 0. Each pure dimer covering corresponds to a sequence of interlacing partitions starting with an empty partition and ending in an empty partition. Under the assumption that the probability of each dimer covering is proportional to the product of weights of present edges, we obtain the limit shape (law of large numbers) of the rescaled height functions and the convergence of the unrescaled height fluctuations to a diffeomorphic image of the Gaussian free field (Central Limit Theorem), answering a question in [7]. Applications include the limit shape and height fluctuations for pure steep tilings [9] and pyramid partitions [20; 36; 39; 38]. The technique to obtain these results is to analyze a class of Macdonald processes which involve dual partitions as well.
- Award ID(s): 1928930
- PAR ID: 10529216
- Publisher / Repository: Cambridge University Press
- Date Published:
- Journal Name: Forum of Mathematics, Sigma
- Volume: 11
- ISSN: 2050-5094
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
A divide-and-conquer (DAC) machine learning approach was first proposed by Wang et al. to forecast the sea surface height (SSH) of the Loop Current System (LCS) in the Gulf of Mexico. In this DAC approach, the forecast domain was divided into non-overlapping partitions, each of which had its own prediction model. The full-domain SSH prediction was recovered by interpolating the SSH across the partition boundaries. Although the original DAC model was able to predict the LCS evolution and eddy shedding more than two months and three months in advance, respectively, growing errors at the partition boundaries negatively affected the model's forecasting skill. In the study herein, a new partitioning method, which consists of overlapping partitions, is presented. The region of interest is divided into 50%-overlapping partitions. At each prediction step, the SSH value at each point is computed from overlapping partitions, which significantly reduces the occurrence of unrealistic SSH features at partition boundaries. This new approach led to a significant improvement in overall model performance, both in terms of feature prediction, such as the location of the LC eddy SSH contours, and in terms of event prediction, such as the LC ring separation. We observed an approximately 12% decrease in error over a 10-week prediction, and also show that this method can approximate the location and shedding of eddy Cameron better than the original DAC method.
-
We prove that a polynomial fraction of the set of $$k$$-component forests in the $$m \times n$$ grid graph have equal numbers of vertices in each component, for any constant $$k$$. This resolves a conjecture of Charikar, Liu, Liu, and Vuong, and establishes the first provably polynomial-time algorithm for (exactly or approximately) sampling balanced grid graph partitions according to the spanning tree distribution, which weights each $$k$$-partition according to the product, across its $$k$$ pieces, of the number of spanning trees of each piece. Our result follows from a careful analysis of the probability a uniformly random spanning tree of the grid can be cut into balanced pieces. Beyond grids, we show that for a broad family of lattice-like graphs, we achieve balance up to any multiplicative $$(1 \pm \varepsilon)$$ constant with constant probability. More generally, we show that, with constant probability, components derived from uniform spanning trees can approximate any given partition of a planar region specified by Jordan curves. This implies polynomial-time algorithms for sampling approximately balanced tree-weighted partitions for lattice-like graphs. Our results have applications to understanding political districtings, where there is an underlying graph of indivisible geographic units that must be partitioned into $$k$$ population-balanced connected subgraphs. In this setting, tree-weighted partitions have interesting geometric properties, and this has stimulated significant effort to develop methods to sample them.
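As a rough illustration of the spanning tree weighting described in this abstract, the sketch below computes the weight of a given partition of a small grid graph as the product of the spanning tree counts of its pieces, using the matrix-tree theorem. The helper names (`grid_edges`, `spanning_tree_count`, `tree_weight`) are hypothetical and chosen for this example; this is not the sampling algorithm from the paper, only the weight function it refers to.

```python
from fractions import Fraction
from itertools import product


def grid_edges(m, n):
    """Edges of the m x n grid graph on vertices (row, col)."""
    edges = []
    for r, c in product(range(m), range(n)):
        if r + 1 < m:
            edges.append(((r, c), (r + 1, c)))
        if c + 1 < n:
            edges.append(((r, c), (r, c + 1)))
    return edges


def spanning_tree_count(vertices, edges):
    """Spanning trees of the subgraph induced on `vertices` (matrix-tree theorem)."""
    vertices = list(vertices)
    index = {v: i for i, v in enumerate(vertices)}
    k = len(vertices)
    # Build the graph Laplacian of the induced subgraph with exact arithmetic.
    lap = [[Fraction(0)] * k for _ in range(k)]
    for u, v in edges:
        if u in index and v in index:
            i, j = index[u], index[v]
            lap[i][i] += 1
            lap[j][j] += 1
            lap[i][j] -= 1
            lap[j][i] -= 1
    # Delete the last row and column, then take the determinant
    # by Gaussian elimination. A disconnected piece gives 0.
    minor = [row[:-1] for row in lap[:-1]]
    size = k - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if minor[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            minor[col], minor[pivot] = minor[pivot], minor[col]
            det = -det
        det *= minor[col][col]
        for r in range(col + 1, size):
            factor = minor[r][col] / minor[col][col]
            for c in range(col, size):
                minor[r][c] -= factor * minor[col][c]
    return int(det)


def tree_weight(pieces, edges):
    """Product over pieces of the number of spanning trees of each piece."""
    weight = 1
    for piece in pieces:
        weight *= spanning_tree_count(piece, edges)
    return weight


# Split the 2 x 4 grid into two balanced 2 x 2 blocks (each block is a 4-cycle).
edges = grid_edges(2, 4)
left = [(r, c) for r in range(2) for c in range(0, 2)]
right = [(r, c) for r in range(2) for c in range(2, 4)]
print(tree_weight([left, right], edges))
```

Exact `Fraction` arithmetic is used so that the determinant is an exact integer; for larger grids a floating-point or modular determinant would be the more practical choice.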
-
The physical data layout significantly impacts performance when database systems access cold data. In addition to the traditional row store and column store designs, recent research proposes to partition tables hierarchically, starting from either horizontal or vertical partitions and then determining the best partitioning strategy on the other dimension independently for each partition. All these partitioning strategies naturally produce rectangular partitions. Coarse-grained rectangular partitioning reads unnecessary data when a table cannot be partitioned along one dimension for all queries. Fine-grained rectangular partitioning produces many small partitions which negatively impacts I/O performance and possibly introduces a high tuple reconstruction overhead. This paper introduces Jigsaw, a system that employs a novel partitioning strategy that creates partitions with arbitrary shapes, which we refer to as irregular partitions. The traditional tuple-at-a-time or operator-at-a-time query processing models cannot fully leverage the advantages of irregular partitioning, because they may repeatedly read a partition due to its irregular shape. Jigsaw introduces a partition-at-a-time evaluation strategy to avoid repeated accesses to an irregular partition. We implement and evaluate Jigsaw on the HAP and TPC-H benchmarks and find that irregular partitioning is up to 4.2× faster than a columnar layout for moderately selective queries. Compared with the columnar layout, irregular partitioning only transfers 21% of the data to complete the same query.
-
Let $$G$$ be a graph with vertex set $$\{1,2,\ldots,n\}$$. Its bond lattice, $$BL(G)$$, is a sublattice of the set partition lattice. The elements of $$BL(G)$$ are the set partitions whose blocks induce connected subgraphs of $$G$$. In this article, we consider graphs $$G$$ whose bond lattice consists only of noncrossing partitions. We define a family of graphs, called triangulation graphs, with this property and show that any two produce isomorphic bond lattices. We then look at the enumeration of the maximal chains in the bond lattices of triangulation graphs. Stanley's map from maximal chains in the noncrossing partition lattice to parking functions was our motivation. We find the restriction of his map to the bond lattice of certain subgraphs of triangulation graphs. Finally, we show the number of maximal chains in the bond lattice of a triangulation graph is the number of ordered cycle decompositions.
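The definition of the bond lattice in this abstract can be made concrete with a brute-force enumeration. The sketch below lists the set partitions of $$\{1,2,\ldots,n\}$$ whose blocks induce connected subgraphs of a small graph. The function names (`set_partitions`, `induces_connected`, `bond_lattice_elements`) are hypothetical, and the code only enumerates the elements; it does not build the lattice order or the maximal chains studied in the article.

```python
def set_partitions(items):
    """Yield every set partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms its own block ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def induces_connected(block, adjacency):
    """True if the subgraph induced on `block` is connected (depth-first search)."""
    block = set(block)
    start = next(iter(block))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w in block and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == block


def bond_lattice_elements(n, edges):
    """Set partitions of {1, ..., n} whose blocks induce connected subgraphs."""
    adjacency = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return [
        partition
        for partition in set_partitions(list(range(1, n + 1)))
        if all(induces_connected(block, adjacency) for block in partition)
    ]


# Path 1-2-3: the partition {1,3},{2} is excluded because {1,3} is not connected.
path = bond_lattice_elements(3, [(1, 2), (2, 3)])
# Triangle on 3 vertices: every block is connected, so all 5 set partitions appear.
triangle = bond_lattice_elements(3, [(1, 2), (2, 3), (1, 3)])
print(len(path), len(triangle))
```

The enumeration visits all set partitions, so its cost grows with the Bell numbers and it is only suitable for very small $$n$$.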

