skip to main content


Title: Crowdsourcing Perceptions of Gerrymandering
Gerrymandering is the manipulation of redistricting to influence the results of a set of elections for local representatives. Gerrymandering has the potential to drastically swing power in legislative bodies even with no change in a population’s political views. Identifying gerrymandering and measuring fairness using metrics of proposed district plans is a topic of current research, but there is less work on how such plans will be perceived by voters. Gathering data on such perceptions presents several challenges such as the ambiguous definitions of ‘fair’ and the complexity of real world geography and district plans. We present a dataset collected from an online crowdsourcing platform on a survey asking respondents to mark which of two maps of equal population distribution but different districts appear more ‘fair’ and the reasoning for their decision. We performed preliminary analysis on this data and identified which of several commonly suggested metrics are most predictive of the responses. We found that the maximum perimeter of any district was the most predictive metric, especially with participants who reported that they made their decision based on the shape of the districts.  more » « less
Award ID(s):
1453542 2106983 2007994
NSF-PAR ID:
10388996
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
Volume:
10
Issue:
1
ISSN:
2769-1330
Page Range / eLocation ID:
124 to 132
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Berry, Jonathan ; Shmoys, David ; Cowen, Lenore ; Naumann, Uwe (Ed.)
    In the United States, regions (such as states or counties) are frequently divided into districts for the purpose of electing representatives. How the districts are drawn can have a profound effect on who's elected, and drawing the districts to give an advantage to a certain group is known as gerrymandering. It can be surprisingly difficult to detect when gerrymandering is occurring, but one algorithmic method is to compare a current districting plan to a large number of randomly sampled plans to see whether it is an outlier. Recombination Markov chains are often used to do this random sampling: randomly choose two districts, consider their union, and split this union up in a new way. This approach works well in practice and has been widely used, including in litigation, but the theory behind it remains underdeveloped. For example, it's not known if recombination Markov chains are irreducible, that is, if recombination moves suffice to move from any districting plan to any other. Irreducibility of recombination Markov chains can be formulated as a graph problem: for a planar graph G, is the space of all partitions of G into κ connected subgraphs (κ districts) connected by recombination moves? While the answer is yes when districts can be as small as one vertex, this is not realistic in real-world settings where districts must have approximately balanced populations. Here we fix district sizes to be κ1 ± 1 vertices, κ2 ± 1 vertices,… for fixed κ1, κ2,…, a more realistic setting. We prove for arbitrarily large triangular regions in the triangular lattice, when there are three simply connected districts, recombination Markov chains are irreducible. This is the first proof of irreducibility under tight district size constraints for recombination Markov chains beyond small or trivial examples. The triangular lattice is the most natural setting in which to first consider such a question, as graphs representing states/regions are frequently triangulated. The proof uses a sweep-line argument, and there is hope it will generalize to more districts, triangulations satisfying mild additional conditions, and other redistricting Markov chains. 
    more » « less
  2. Let f be a drawing in the Euclidean plane of a graph G, which is understood to be a 1-dimensional simplicial complex. We assume that every edge of G is drawn by f as a curve of constant algebraic complexity, and the ratio of the length of the longest simple path to the the length of the shortest edge is poly(n). In the drawing f, a path P of G, or its image in the drawing π = f(P), is β-stretch if π is a simple (non-self-intersecting) curve, and for every pair of distinct points p ∈ P and q ∈ P , the length of the sub-curve of π connecting f(p) with f(q) is at most β∥f(p) − f(q)∥, where ∥.∥ denotes the Euclidean distance. We introduce and study the β-stretch Path Problem (βSP for short), in which we are given a pair of vertices s and t of G, and we are to decide whether in the given drawing of G there exists a β-stretch path P connecting s and t. We also output P if it exists. The βSP quantifies a notion of “near straightness” for paths in a graph G, motivated by gerrymandering regions in a map, where edges of G represent natural geographical/political boundaries that may be chosen to bound election districts. The notion of a β-stretch path naturally extends to cycles, and the extension gives a measure of how gerrymandered a district is. Furthermore, we show that the extension is closely related to several studied measures of local fatness of geometric shapes. We prove that βSP is strongly NP-complete. We complement this result by giving a quasi-polynomial time algorithm, that for a given ε > 0, β ∈ O(poly(log |V (G)|)), and s, t ∈ V (G), outputs a β-stretch path between s and t, if a (1 − ε)β-stretch path between s and t exists in the drawing. 
    more » « less
  3. To audit political district maps for partisan gerrymandering, one may determine a baseline for the expected distribution of partisan outcomes by sampling an ensemble of maps. One approach to sampling is to use redistricting policy as a guide to precisely codify preferences between maps. Such preferences give rise to a probability distribution on the space of redistricting plans, and Metropolis-Hastings methods allow one to sample ensembles of maps from the specified distribution. Although these approaches have nice theoretical properties and have successfully detected gerrymandering in legal settings, sampling from commonly-used policy-driven distributions is often computationally difficult. As of yet, there is no algorithm that can be used off-the-shelf for checking maps under generic redistricting criteria. In this work, we mitigate the computational challenges in a Metropolized-sampling technique through a parallel tempering method combined with ReCom[11] and, for the first time, validate that such techniques are effective on these problems at the scale of statewide precinct graphs for more policy informed measures. We develop these improvements through the first case study of district plans in Georgia. Our analysis projects that any election in Georgia will reliably elect 9 Republicans and 5 Democrats under the enacted plan. This result is largely fixed even as public opinion shifts toward either party and the partisan outcome of the enacted plan does not respond to the will of the people. Only 0.12% of the ∼ 160K plans in our ensemble were similarly non-responsive. 
    more » « less
  4. null (Ed.)
    Clustering is a foundational problem in machine learning with numerous applications. As machine learning increases in ubiquity as a back-end for automated systems, concerns about fairness arise. Much of the current literature on fairness deals with discrimination against protected classes in supervised learning (group fairness). We define a different notion of fair clustering wherein the probability that two points (or a community of points) become separated is bounded by an increasing function of their pairwise distance (or community diameter). We capture the situation where data points represent people who gain some benefit from being clustered together. Unfairness arises when certain points are deterministically separated, either arbitrarily or by someone who intends to harm them as in the case of gerrymandering election districts. In response, we formally define two new types of fairness in the clustering setting, pairwise fairness and community preservation. To explore the practicality of our fairness goals, we devise an approach for extending existing k-center algorithms to satisfy these fairness constraints. Analysis of this approach proves that reasonable approximations can be achieved while maintaining fairness. In experiments, we compare the effectiveness of our approach to classical k-center algorithms/heuristics and explore the tradeoff between optimal clustering and fairness. 
    more » « less
  5. Feng, Mingyu ; Käser, Tanja ; Talukdar, Partha (Ed.)
    Recent research seeks to develop more comprehensive learner models for adaptive learning software. For example, models of reading comprehension built using data from students’ use of adaptive instructional software for mathematics have recently been developed. These models aim to deliver experiences that consider factors related to learning beyond performance in the target domain for instruction. We investigate the extent to which generalization is possible for a recently developed predictive model that seeks to infer students’ reading comprehension ability (as measured by end-of-year standardized test scores) using an introductory learning experience in Carnegie Learning’s MATHia intelligent tutoring system for mathematics. Building on a model learned on data from middle school students in a single school district in a mid-western U.S. state, using that state’s end-of-year English Language Arts (ELA) standardized test score as an outcome, we consider data from a school district in a south-eastern U.S. state as well as that state’s end-of-year ELA standardized test outcome. Generalization is explored by considering prediction performance when training and testing models on data from each of the individual school districts (and for their respective state’s test outcomes) as well as pooling data from both districts together. We conclude with discussion of investigations of some algorithmic fairness characteristics of the learned models. The results suggest that a model trained on data from the smaller of the two school districts considered may achieve greater fairness in its predictions over models trained on data from the other district or both districts, despite broad, overall similarities in some demographic characteristics of the two school districts. This raises interesting questions for future research on generalizing these kinds of models as well as on ensuring algorithmic fairness of resulting models for use in real-world adaptive systems for learning. 
    more » « less