skip to main content


Title: Lower Bounds on Kemeny Rank Aggregation with Non-Strict Rankings
Rank aggregation has many applications in computer science, operations research, and group decision-making. This paper introduces lower bounds on the Kemeny aggregation problem when the input rankings are non-strict (with and without ties). It generalizes some of the existing lower bounds for strict rankings to the case of non-strict rankings, and it proposes shortcuts for reducing the run time of these techniques. More specifically, we use Condorcet criterion variations and the Branch & Cut method to accelerate the lower bounding process.  more » « less
Award ID(s):
1850355
NSF-PAR ID:
10315499
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2021 IEEE Symposium Series on Computational Intelligence
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The problem of rank aggregation from pairwise and multiway comparisons has a wide range of implications, ranging from recommendation systems to sports rankings to social choice. Some of the most popular algorithms for this problem come from the class of spectral ranking algorithms; these include the rank centrality (RC) algorithm for pairwise comparisons, which returns consistent estimates under the Bradley-Terry-Luce (BTL) model for pairwise comparisons (Negahban et al., 2017), and its generalization, the Luce spectral ranking (LSR) algorithm, which returns consistent estimates under the more general multinomial logit (MNL) model for multiway comparisons (Maystre & Grossglauser, 2015). In this paper, we design a provably faster spectral ranking algorithm, which we call accelerated spectral ranking (ASR), that is also consistent under the MNL/BTL models. Our accelerated algorithm is achieved by designing a random walk that has a faster mixing time than the random walks associated with previous algorithms. In addition to a faster algorithm, our results yield improved sample complexity bounds for recovery of the MNL/BTL parameters: to the best of our knowledge, we give the first general sample complexity bounds for recovering the parameters of the MNL model from multiway comparisons under any (connected) comparison graph (and improve significantly over previous bounds for the BTL model for pairwise comparisons). We also give a message-passing interpretation of our algorithm, which suggests a decentralized distributed implementation. Our experiments on several real-world and synthetic datasets confirm that our new ASR algorithm is indeed orders of magnitude faster than existing algorithms. 
    more » « less
  2. The problem of rank aggregation from pairwise and multiway comparisons has a wide range of implications, ranging from recommendation systems to sports rankings to social choice. Some of the most popular algorithms for this problem come from the class of spectral ranking algorithms; these include the rank centrality (RC) algorithm for pairwise comparisons, which returns consistent estimates under the Bradley-Terry-Luce (BTL) model for pairwise comparisons (Negahban et al., 2017), and its generalization, the Luce spectral ranking (LSR) algorithm, which returns consistent estimates under the more general multinomial logit (MNL) model for multiway comparisons (Maystre & Grossglauser, 2015). In this paper, we design a provably faster spectral ranking algorithm, which we call accelerated spectral ranking (ASR), that is also consistent under the MNL/BTL models. Our accelerated algorithm is achieved by designing a random walk that has a faster mixing time than the random walks associated with previous algorithms. In addition to a faster algorithm, our results yield improved sample complexity bounds for recovery of the MNL/BTL parameters: to the best of our knowledge, we give the first general sample complexity bounds for recovering the parameters of the MNL model from multiway comparisons under any (connected) comparison graph (and improve significantly over previous bounds for the BTL model for pairwise comparisons). We also give a message-passing interpretation of our algorithm, which suggests a decentralized distributed implementation. Our experiments on several real-world and synthetic datasets confirm that our new ASR algorithm is indeed orders of magnitude faster than existing algorithms. 
    more » « less
  3. In social choice, traditional Kemeny rank aggregation combines the preferences of voters, expressed as rankings, into a single consensus ranking without consideration for how this ranking may unfairly affect marginalized groups (i.e., racial or gender). Developing fair rank aggregation methods is critical due to their societal influence in applications prioritizing job applicants, funding proposals, and scheduling medical patients. In this work, we introduce the Fair Exposure Kemeny Aggregation Problem (FairExp-kap) for combining vast and diverse voter preferences into a single ranking that is not only a suitable consensus, but ensures opportunities are not withheld from marginalized groups. In formalizing FairExp-kap, we extend the fairness of exposure notion from information retrieval to the rank aggregation context and present a complimentary metric for voter preference representation. We design algorithms for solving FairExp-kap that explicitly account for position bias, a common ranking-based concern that end-users pay more attention to higher ranked candidates. epik solves FairExp-kap exactly by incorporating non-pairwise fairness of exposure into the pairwise Kemeny optimization; while the approximate epira is a candidate swapping algorithm, that guarantees ranked candidate fairness. Utilizing comprehensive synthetic simulations and six real-world datasets, we show the efficacy of our approach illustrating that we succeed in mitigating disparate group exposure unfairness in consensus rankings, while maximally representing voter preferences. 
    more » « less
  4. Instant runoff voting (IRV) is an increasingly-popular alternative to traditional plurality voting in which voters submit rankings over the candidates rather than single votes. In practice, elections using IRV often restrict the ballot length, the number of candidates a voter is allowed to rank on their ballot. We theoretically and empirically analyze how ballot length can influence the outcome of an election, given fixed voter preferences. We show that there exist preference profiles over k candidates such that up to k-1 different candidates win at different ballot lengths. We derive exact lower bounds on the number of voters required for such profiles and provide a construction matching the lower bound for unrestricted voter preferences. Additionally, we characterize which sequences of winners are possible over ballot lengths and provide explicit profile constructions achieving any feasible winner sequence. We also examine how classic preference restrictions influence our results—for instance, single-peakedness makes k-1 different winners impossible but still allows at least Ω(√k). Finally, we analyze a collection of 168 real-world elections, where we truncate rankings to simulate shorter ballots. We find that shorter ballots could have changed the outcome in one quarter of these elections. Our results highlight ballot length as a consequential degree of freedom in the design of IRV elections. 
    more » « less
  5. A number of applications require two-sample testing on ranked preference data. For instance, in crowdsourcing, there is a long-standing question of whether pairwise comparison data provided by people is distributed similar to ratings-converted-to-comparisons. Other examples include sports data analysis and peer grading. In this paper, we design two-sample tests for pairwise comparison data and ranking data. For our two-sample test for pairwise comparison data, we establish an upper bound on the sample complexity required to correctly distinguish between the distributions of the two sets of samples. Our test requires essentially no assumptions on the distributions. We then prove complementary lower bounds showing that our results are tight (in the minimax sense) up to constant factors. We investigate the role of modeling assumptions by proving lower bounds for a range of pairwise comparison models (WST, MST, SST, parameter-based such as BTL and Thurstone). We also provide testing algorithms and associated sample complexity bounds for the problem of two-sample testing with partial (or total) ranking data. Furthermore, we empirically evaluate our results via extensive simulations as well as two real-world datasets consisting of pairwise comparisons. By applying our two-sample test on real-world pairwise comparison data, we conclude that ratings and rankings provided by people are indeed distributed differently. On the other hand, our test recognizes no significant difference in the relative performance of European football teams across two seasons. Finally, we apply our two-sample test on a real-world partial and total ranking dataset and find a statistically significant difference in Sushi preferences across demographic divisions based on gender, age and region of residence. 
    more » « less