Balanced centroidal power diagrams for redistricting
We consider the problem of political redistricting: given the locations of people in a geographical area (e.g. a US state), the goal is to decompose the area into subareas, called districts, so that the populations of the districts are as close as possible and the districts are compact'' and contiguous,'' to use the terms referred to in most US state constitutions and/or US Supreme Court rulings. We study a method that outputs a solution in which each district is the intersection of a convex polygon with the geographical area. The average number of sides per polygon is less than six. The polygons tend to be quite compact. Every two districts differ in population by at most one (so we call the solution balanced). In fact, the solution is a centroidal power diagram: each polygon has an associated center in ℝ³ such that * the projection of the center onto the plane z = 0 is the centroid of the locations of people assigned to the polygon, and * for each person assigned to that polygon, the polygon's center is closest among all centers. The polygons are convex because they are the intersections of 3D Voronoi cells with the plane. The solution is, in a well-defined sense, a locally optimal solution to the problem of choosing centers in the plane and choosing an assignment of people to those 2-d centers so as to minimize the sum of squared distances subject to the assignment being balanced. * A practical problem with this approach is that, in real-world redistricting, exact locations of people are unknown. Instead, the input consists of polygons (census blocks) and associated populations. A real redistricting must not split census blocks. We therefore propose a second phase that perturbs the solution slightly so it does not split census blocks. In our experiments, the second phase achieves this while preserving perfect population balance. The district polygons are no longer convex at the fine scale because their boundaries must follow the boundaries of census blocks, but at a coarse scale they preserve the shape of the original polygons.  more » « less
Award ID(s):
NSF-PAR ID:
10128778
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Page Range / eLocation ID:
389 to 396
Format(s):
Medium: X
National Science Foundation
##### More Like this
1. ; ; ; (Ed.)
In the United States, regions (such as states or counties) are frequently divided into districts for the purpose of electing representatives. How the districts are drawn can have a profound effect on who's elected, and drawing the districts to give an advantage to a certain group is known as gerrymandering. It can be surprisingly difficult to detect when gerrymandering is occurring, but one algorithmic method is to compare a current districting plan to a large number of randomly sampled plans to see whether it is an outlier. Recombination Markov chains are often used to do this random sampling: randomly choose two districts, consider their union, and split this union up in a new way. This approach works well in practice and has been widely used, including in litigation, but the theory behind it remains underdeveloped. For example, it's not known if recombination Markov chains are irreducible, that is, if recombination moves suffice to move from any districting plan to any other. Irreducibility of recombination Markov chains can be formulated as a graph problem: for a planar graph G, is the space of all partitions of G into κ connected subgraphs (κ districts) connected by recombination moves? While the answer is yes when districts can be as small as one vertex, this is not realistic in real-world settings where districts must have approximately balanced populations. Here we fix district sizes to be κ1 ± 1 vertices, κ2 ± 1 vertices,… for fixed κ1, κ2,…, a more realistic setting. We prove for arbitrarily large triangular regions in the triangular lattice, when there are three simply connected districts, recombination Markov chains are irreducible. This is the first proof of irreducibility under tight district size constraints for recombination Markov chains beyond small or trivial examples. The triangular lattice is the most natural setting in which to first consider such a question, as graphs representing states/regions are frequently triangulated. The proof uses a sweep-line argument, and there is hope it will generalize to more districts, triangulations satisfying mild additional conditions, and other redistricting Markov chains.
more » « less
2. (Ed.)
It is commonly believed that, in congressional and state legislature elections in the United States, rural voters have an inherent political advantage over urban voters. We study this hypothesis using an idealized redistricting method, balanced centroidal power diagrams, that achieves essentially perfect population balance while optimizing a principled measure of compactness. We find that, using this method, the degree to which rural or urban voters have a political advantage depends on the number of districts and the population density of urban areas. Moreover, we find that the political advantage in any case tends to be dramatically less than that afforded by district plans used in the real world, including district plans drawn by presumably neutral parties such as the courts. One possible explanation is suggested by the following discovery: modifying centroidal power diagrams to prefer placing boundaries along city boundaries significantly increases the advantage rural voters have over urban voters.
more » « less
3. Data are available for download at http://arcticdata.io/data/10.18739/A2KW57K57 Permafrost can be indirectly detected via remote sensing techniques through the presence of ice-wedge polygons, which are a ubiquitous ground surface feature in tundra regions. Ice-wedge polygons form through repeated annual cracking of the ground during cold winter days. In spring, the cracks fill in with snowmelt water, creating ice wedges, which are connected across the landscape in an underground network and that can grow to several meters depth and width. The growing ice wedges push the soil upwards, forming ridges that bound low-centered ice-wedge polygons. If the top of the ice wedge melts, the ground subsides and the ridges become troughs and the ice-wedge polygons become high-centered. Here, a Convolutional Neural Network is used to map the boundaries of individual ice-wedge polygons based on high-resolution commercial satellite imagery obtained from the Polar Geospatial Center. This satellite imagery used for the detection of ice-wedge polygons represent years between 2001 and 2021, so this dataset represents ice-wedge polygons mapped from different years. This dataset does not include a time series (i.e. same area mapped more than once). The shapefiles are masked, reprojected, and processed into GeoPackages with calculated attributes for each ice-wedge polygon such as circumference and width. The GeoPackages are then rasterized with new calculated attributes for ice-wedge polygon coverage such a coverage density. This release represents the region classified as “high ice” by Brown et al. 1997. The dataset is available to explore on the Permafrost Discovery Gateway (PDG), an online platform that aims to make big geospatial permafrost data accessible to enable knowledge-generation by researchers and the public. The PDG project creates various pan-Arctic data products down to the sub-meter and monthly resolution. Access the PDG Imagery Viewer here: https://arcticdata.io/catalog/portals/permafrost Data limitations in use: This data is part of an initial release of the pan-Arctic data product for ice-wedge polygons, and it is expected that there are constraints on its accuracy and completeness. Users are encouraged to provide feedback regarding how they use this data and issues they encounter during post-processing. Please reach out to the dataset contact or a member of the PDG team via support@arcticdata.io.
more » « less
4. This paper is motivated by a practical problem: many U.S. states have public hearings on "communities of interest" as part of their redistricting process, but no state has as yet adopted a concrete method of spatializing and aggregating community maps in order to take them into account in the drawing of new boundaries for electoral districts. Below, we describe a year-long project that collected and synthesized thousands of community maps through partnerships with grassroots organizations and/or government offices. The submissions were then aggregated by geographical clustering with a modified Hausdorff distance; then, the text from the narrative submissions was classified with semantic labels so that short runs of a Markov chain could be used to form semantic sub-clusters. The resulting dataset is publicly available, including the raw data of submitted community maps as well as post-processed community clusters and a scoring system for measuring how well districting plans respect the clusters. We provide a discussion of the strengths and weaknesses of this methodology and conclude with proposed directions for future work.
more » « less
5. Abstract Hard-to-predict bursts of COVID-19 pandemic revealed significance of statistical modeling which would resolve spatio-temporal correlations over geographical areas, for example spread of the infection over a city with census tract granularity. In this manuscript, we provide algorithmic answers to the following two inter-related public health challenges of immense social impact which have not been adequately addressed (1) Inference Challenge assuming that there are N census blocks (nodes) in the city, and given an initial infection at any set of nodes, e.g. any N of possible single node infections, any $$N(N-1)/2$$ N ( N - 1 ) / 2 of possible two node infections, etc, what is the probability for a subset of census blocks to become infected by the time the spread of the infection burst is stabilized? (2) Prevention Challenge What is the minimal control action one can take to minimize the infected part of the stabilized state footprint? To answer the challenges, we build a Graphical Model of pandemic of the attractive Ising (pair-wise, binary) type, where each node represents a census tract and each edge factor represents the strength of the pairwise interaction between a pair of nodes, e.g. representing the inter-node travel, road closure and related, and each local bias/field represents the community level of immunization, acceptance of the social distance and mask wearing practice, etc. Resolving the Inference Challenge requires finding the Maximum-A-Posteriory (MAP), i.e. most probable, state of the Ising Model constrained to the set of initially infected nodes. (An infected node is in the $$+ \, 1$$ + 1 state and a node which remained safe is in the $$- \, 1$$ - 1 state.) We show that almost all attractive Ising Models on dense graphs result in either of the two possibilities (modes) for the MAP state: either all nodes which were not infected initially became infected, or all the initially uninfected nodes remain uninfected (susceptible). This bi-modal solution of the Inference Challenge allows us to re-state the Prevention Challenge as the following tractable convex programming : for the bare Ising Model with pair-wise and bias factors representing the system without prevention measures, such that the MAP state is fully infected for at least one of the initial infection patterns, find the closest, for example in $$l_1$$ l 1 , $$l_2$$ l 2 or any other convexity-preserving norm, therefore prevention-optimal, set of factors resulting in all the MAP states of the Ising model, with the optimal prevention measures applied, to become safe. We have illustrated efficiency of the scheme on a quasi-realistic model of Seattle. Our experiments have also revealed useful features, such as sparsity of the prevention solution in the case of the $$l_1$$ l 1 norm, and also somehow unexpected features, such as localization of the sparse prevention solution at pair-wise links which are NOT these which are most utilized/traveled.
more » « less