

This content will become publicly available on May 1, 2026

Title: Who Reviews The Reviewers? A Multi-Level Jury Problem
We consider the problem of determining a binary ground truth using advice from a group of independent reviewers (experts) who express their guess about a ground truth correctly with some independent probability (competence) p_i. In this setting, when all reviewers are competent with p_i ≥ 0.5, the Condorcet Jury Theorem tells us that adding more reviewers increases the overall accuracy, and if all p_i's are known, then there exists an optimal weighting of the reviewers. However, in practical settings, reviewers may be noisy or incompetent, i.e., p_i ≤ 0.5, and the number of experts may be small, so the asymptotic Condorcet Jury Theorem is not practically relevant. In such cases we explore appointing one or more chairs (judges) who determine the weight of each reviewer for aggregation, creating multiple levels. However, these chairs may be unable to correctly identify the competence of the reviewers they oversee, and therefore unable to compute the optimal weighting. We give conditions on when a set of chairs is able to weight the reviewers optimally, and depending on the competence distribution of the agents, give results about when it is better to have more chairs or more reviewers. Through simulations we show that in some cases it is better to have more chairs, but in many cases it is better to have more reviewers.
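As a rough illustration of the setting in the abstract, the sketch below computes the exact accuracy of a weighted vote for a small panel and compares equal weights with log-odds weights, log(p_i / (1 - p_i)). The log-odds rule is the standard optimal weighting for independent voters with known competences and a uniform prior; the abstract does not state which formulation the paper uses, so treat that choice as an assumption. The function names and the example competences are hypothetical, and the chair level is not modeled.

```python
from itertools import product
from math import log


def weighted_vote_accuracy(competences, weights):
    """Exact probability that a weighted vote recovers the binary ground truth.

    Enumerates every correct/incorrect pattern, so it only suits small panels.
    A tied vote is counted as correct half the time (fair coin flip).
    """
    total = 0.0
    for pattern in product([True, False], repeat=len(competences)):
        prob = 1.0   # probability of this exact pattern
        score = 0.0  # signed weight in favour of the true answer
        for correct, p, w in zip(pattern, competences, weights):
            prob *= p if correct else 1.0 - p
            score += w if correct else -w
        if abs(score) < 1e-12:
            total += 0.5 * prob
        elif score > 0:
            total += prob
    return total


def log_odds_weights(competences):
    """Assumed optimal weights; requires 0 < p < 1.

    A reviewer with p < 0.5 gets a negative weight, i.e. the vote is inverted.
    """
    return [log(p / (1.0 - p)) for p in competences]


# Hypothetical panel: one strong reviewer, two mildly competent, two below 0.5.
competences = [0.9, 0.6, 0.6, 0.4, 0.4]
equal = weighted_vote_accuracy(competences, [1.0] * len(competences))
optimal = weighted_vote_accuracy(competences, log_odds_weights(competences))
print(f"equal weights: {equal:.4f}, log-odds weights: {optimal:.4f}")

# Condorcet-style check: with identical competent reviewers, more is better.
three = weighted_vote_accuracy([0.6] * 3, [1.0] * 3)
five = weighted_vote_accuracy([0.6] * 5, [1.0] * 5)
```

With this particular panel the log-odds weights let the strongest reviewer decide every vote, which shows why a chair who misjudges competences can do worse than simple majority.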
Award ID(s):
2339880 2007955
PAR ID:
10579702
Author(s) / Creator(s):
; ;
Publisher / Repository:
24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)
Date Published:
Format(s):
Medium: X
Location:
Detroit, MI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the problem of determining a binary ground truth using advice from a group of independent reviewers (experts) who express their guess about a ground truth correctly with some independent probability (competence) $$p_i$$. In this setting, when all reviewers are competent with $$p \geq 0.5$$, the Condorcet Jury Theorem tells us that adding more reviewers increases the overall accuracy, and if all $$p_i$$'s are known, then there exists an optimal weighting of the reviewers. However, in practical settings, reviewers may be noisy or incompetent, i.e., $$p_i \leq 0.5$$, and the number of experts may be small, so the asymptotic Condorcet Jury Theorem is not practically relevant. In such cases we explore appointing one or more chairs (judges) who determine the weight of each reviewer for aggregation, creating multiple levels. However, these chairs may be unable to correctly identify the competence of the reviewers they oversee, and therefore unable to compute the optimal weighting. We give conditions on when a set of chairs is able to weight the reviewers optimally, and depending on the competence distribution of the agents, give results about when it is better to have more chairs or more reviewers. Through simulations we show that in some cases it is better to have more chairs, but in many cases it is better to have more reviewers. 
  2. We investigate the problem of determining a binary ground truth using advice from a group of independent reviewers (experts) who express their guess about a ground truth correctly with some independent probability (competence) p_i. In this setting, when all reviewers are competent with p >= 0.5, the Condorcet Jury Theorem tells us that adding more reviewers increases the overall accuracy, and if all p_i's are known, then there exists an optimal weighting of the reviewers. However, in practical settings, reviewers may be noisy or incompetent, i.e., p_i < 0.5, and the number of experts may be small, so the asymptotic Condorcet Jury Theorem is not practically relevant. In such cases we explore appointing one or more chairs (judges) who determine the weight of each reviewer for aggregation, creating multiple levels. However, these chairs may be unable to correctly identify the competence of the reviewers they oversee, and therefore unable to compute the optimal weighting. We give conditions when a set of chairs is able to weight the reviewers optimally, and depending on the competence distribution of the agents, give results about when it is better to have more chairs or more reviewers. Through numerical simulations we show that in some cases it is better to have more chairs, but in many cases it is better to have more reviewers. 
  3. The law expects jurors to weigh the facts and evidence of a case to inform the decision with which they are charged. However, evidence in legal cases is becoming increasingly complicated, and studies have raised questions about laypeople's abilities to understand and use complex evidence to inform decisions. Compared to other studies that have looked at general evidence comprehension and expert credibility (e.g. Schweitzer & Saks, 2012), this experimental study investigated whether jurors can appropriately weigh strong vs. weak DNA evidence without special assistance. That is, without help to understand when DNA evidence is relatively weak, are jurors sensitive to the strength of weak DNA evidence as compared to strong DNA evidence? Responses from jury-eligible participants (N=346) were collected from Amazon Mechanical Turk (MTurk). Participants were presented with a summary of a robbery case before being asked a short questionnaire related to verdict preference and evidence comprehension. (Data is from the pilot of experiment 2 for the grant project). We hypothesized participants would not be able to distinguish high- from low-quality DNA evidence. We analyzed the data using Bayes Factors, which allow for directly testing the null hypothesis (Zyphur & Oswald, 2013). A Bayes Factor of 4-8 (depending on the priors used) was found supporting the null for participants' rating of low vs. high quality scientific evidence. A Bayes Factor of 4 means that the null is four times as probable as an alternative hypothesis. Participants tended to rate the DNA evidence as "high quality" no matter the condition they were in. The Bayes Factor of 4-8 in this case gives good reason to believe that jury members are unable to discern what constitutes low quality DNA evidence without assistance. 
If jurors are unable to distinguish between different qualities of evidence, or if they are unaware that they may have to, they could give greater weight to low quality scientific evidence than is warranted. The current study supports the hypothesis that jurors have trouble distinguishing between complicated high vs. low quality evidence without help. Further attempts will be made to discover ways of presenting DNA evidence that could better calibrate jurors in their decisions. These future directions involve larger sample sizes in which jury-eligible participants will complete the study in person. Instead of reading about the evidence, they will watch a filmed mock jury trial. This plan also involves jury deliberation which will provide additional knowledge about how jurors come to conclusions as a group about different qualities of evidence. Acknowledging the potential issues in jury trials and working to solve these problems is a vital step in improving our justice system. 
  4. Sequential learning models situations where agents predict a ground truth in sequence, by using their private, noisy measurements, and the predictions of agents who came earlier in the sequence. We study sequential learning in a social network, where agents only see the actions of the previous agents in their own neighborhood. The fraction of agents who predict the ground truth correctly depends heavily on both the network topology and the ordering in which the predictions are made. A natural question is to find an ordering, with a given network, to maximize the (expected) number of agents who predict the ground truth correctly. In this paper, we show that it is in fact NP-hard to answer this question for a general network, with both the Bayesian learning model and a simple majority rule model. Finally, we show that even approximating the answer is hard. 
  5. In this note we introduce a pseudometric on closed convex planar curves based on distances between normal lines and show its basic properties. Then we use this pseudometric to give a shorter proof of the theorem by Pinchasi that the sum of perimeters of k convex planar bodies with disjoint interiors contained in a convex body of perimeter p and diameter d is not greater than p + 2(k − 1)d. 