Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- 
Leitner, Stephan (Ed.) Objective: Peer review frequently follows a process where reviewers first provide initial reviews, authors respond to these reviews, then reviewers update their reviews based on the authors' response. There is mixed evidence regarding whether this process is useful, including frequent anecdotal complaints that reviewers insufficiently update their scores. In this study, we aim to investigate whether reviewers anchor to their original scores when updating their reviews, which serves as a potential explanation for the lack of updates in reviewer scores. Design: We design a novel randomized controlled trial to test if reviewers exhibit anchoring. In the experimental condition, participants initially see a flawed version of a paper that is corrected after they submit their initial review, while in the control condition, participants only see the correct version. We take various measures to ensure that in the absence of anchoring, reviewers in the experimental group should revise their scores to be identically distributed to the scores from the control group. Furthermore, we construct the reviewed paper to maximize the difference between the flawed and corrected versions, and employ deception to hide the true experiment purpose. Results: Our randomized controlled trial consists of 108 researchers as participants. First, we find that our intervention was successful at creating a difference in perceived paper quality between the flawed and corrected versions: using a permutation test with the Mann-Whitney U statistic, we find that the experimental group's initial scores are lower than the control group's scores in both the Evaluation category (Vargha-Delaney A = 0.64, p = 0.0096) and Overall score (A = 0.59, p = 0.058). Next, we test for anchoring by comparing the experimental group's revised scores with the control group's scores. We find no significant evidence of anchoring in either the Overall (A = 0.50, p = 0.61) or Evaluation category (A = 0.49, p = 0.61). The Mann-Whitney U represents the number of individual pairwise comparisons across groups in which the value from the specified group is stochastically greater, while the Vargha-Delaney A is the normalized version in [0, 1]. Free, publicly-accessible full text available November 18, 2025. (A rough sketch of these test statistics appears after this list.)
- 
Collusion rings pose a significant threat to peer review. In these rings, reviewers who are also authors coordinate to manipulate paper assignments, often by strategically bidding on each other's papers. A promising solution is to detect collusion through these manipulated bids, enabling conferences to take appropriate action. However, while methods exist for detecting other types of fraud, no research has yet shown that identifying collusion rings is feasible. In this work, we consider the question of whether it is feasible to detect collusion rings from the paper bidding. We conduct an empirical analysis of two realistic conference bidding datasets and evaluate existing algorithms for fraud detection in other applications. We find that collusion rings can achieve considerable success at manipulating the paper assignment while remaining hidden from detection: for example, in one dataset, undetected colluders are able to achieve assignment to up to 30% of the papers authored by other colluders. In addition, when 10 colluders bid on all of each other's papers, no detection algorithm outputs a group of reviewers with more than 31% overlap with the true colluders. These results suggest that collusion cannot be effectively detected from the bidding using popular existing tools, demonstrating the need to develop more complex detection algorithms as well as those that leverage additional metadata (e.g., reviewer-paper text-similarity scores). Free, publicly-accessible full text available December 31, 2025. (A toy illustration of mutual-bid detection appears after this list.)
- 
Bailey, Henry Hugh (Ed.) Many peer-review processes involve reviewers submitting their independent reviews, followed by a discussion between the reviewers of each paper. A common question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this question by conducting a randomized controlled trial at the Conference on Uncertainty in Artificial Intelligence (UAI) 2022, where reviewer discussions were conducted over a typed forum. We randomly split the reviewers and papers into two conditions: one with anonymous discussions and the other with non-anonymous discussions. We also conduct an anonymous survey of all reviewers to understand their experience and opinions. We compare the two conditions in terms of the amount of discussion, the influence of seniority on the final decisions, politeness, and reviewers' self-reported experiences and preferences. Overall, this experiment finds small, significant differences favoring the anonymous discussion setup based on the evaluation criteria considered in this work. Free, publicly-accessible full text available December 27, 2025.
- 
            Free, publicly-accessible full text available December 2, 2025
- 
It is common to evaluate a set of items by soliciting people to rate them. For example, universities ask students to rate the teaching quality of their instructors, and conference organizers ask authors of submissions to evaluate the quality of the reviews. However, in these applications, students often give a higher rating to a course if they receive a higher grade in that course, and authors often give a higher rating to the reviews if their papers are accepted to the conference. In this work, we call these external factors the "outcome" experienced by people, and consider the problem of mitigating these outcome-induced biases in the given ratings when some information about the outcome is available. We formulate the information about the outcome as a known partial ordering on the bias. We propose a debiasing method by solving a regularized optimization problem under this ordering constraint, and also provide a carefully designed cross-validation method that adaptively chooses the appropriate amount of regularization. We provide theoretical guarantees on the performance of our algorithm, as well as experimental evaluations. (A simplified sketch of such an ordering-constrained fit appears after this list.)
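
The anchoring study above reports permutation tests on the Mann-Whitney U statistic together with the Vargha-Delaney A effect size. The minimal Python sketch below only illustrates how those two quantities relate and how a permutation p-value can be computed; it is not the authors' analysis code, and the two-sided rejection rule, function names, and example scores are assumptions.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def vargha_delaney_a(x, y):
    """A = U / (n_x * n_y): probability (counting ties as 1/2) that a score
    drawn from x exceeds a score drawn from y; A = 0.5 means no effect."""
    u, _ = mannwhitneyu(x, y, alternative="two-sided")
    return u / (len(x) * len(y))

def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the observed A under the null
    that the two groups' scores are exchangeable (an assumed test form)."""
    rng = np.random.default_rng(seed)
    observed = vargha_delaney_a(x, y)
    pooled = np.concatenate([x, y])
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a = vargha_delaney_a(pooled[:n_x], pooled[n_x:])
        if abs(a - 0.5) >= abs(observed - 0.5):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)  # add-one correction

# Hypothetical usage with made-up review scores:
# a, p = permutation_p_value(np.array([3, 4, 2, 5]), np.array([4, 5, 5, 3]))
```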
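
The collusion-ring entry evaluates existing fraud-detection algorithms on conference bidding data. Purely as a toy illustration of the kind of signal such detectors look for, and not any algorithm evaluated in that work, the sketch below flags groups of reviewers who bid on one another's papers; the data layout and the size-3 threshold are assumptions.

```python
from itertools import combinations

def mutual_bid_edges(bids, authored):
    """Edge (r1, r2) if each reviewer bid on at least one paper authored by the other.
    `bids`: dict reviewer -> set of paper ids they bid on.
    `authored`: dict reviewer -> set of paper ids they authored."""
    reviewers = set(bids) & set(authored)
    return {(r1, r2) for r1, r2 in combinations(sorted(reviewers), 2)
            if bids[r1] & authored[r2] and bids[r2] & authored[r1]}

def suspicious_groups(bids, authored, min_size=3):
    """Connected components of the mutual-bid graph (via union-find);
    larger components are candidates for manual inspection."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in mutual_bid_edges(bids, authored):
        parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return [g for g in groups.values() if len(g) >= min_size]
```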
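
The rating-debiasing entry describes solving a regularized optimization problem under an ordering constraint on the bias. The sketch below is a much-simplified instance of that idea, assuming a single rated item, a total order on rater outcomes instead of a general partial order, a squared-norm penalty, and a hand-picked λ rather than the paper's cross-validation; cvxpy is used only for convenience.

```python
import cvxpy as cp
import numpy as np

def debias_ratings(y, outcome_order, lam):
    """Estimate a shared quality q and per-rater biases b from ratings y,
    where b is constrained to be nondecreasing along `outcome_order`
    (rater indices sorted from worst to best experienced outcome)."""
    n = len(y)
    q = cp.Variable()   # shared underlying quality of the item
    b = cp.Variable(n)  # outcome-induced bias of each rater
    # Penalizing b also resolves the shift ambiguity between q and b.
    objective = cp.Minimize(cp.sum_squares(y - q - b) + lam * cp.sum_squares(b))
    constraints = [b[outcome_order[i]] <= b[outcome_order[i + 1]]
                   for i in range(n - 1)]
    cp.Problem(objective, constraints).solve()
    return float(q.value), np.asarray(b.value)

# Hypothetical usage: five raters, already sorted from worst to best outcome.
# y = np.array([2.0, 3.0, 3.0, 4.0, 5.0])
# q_hat, b_hat = debias_ratings(y, outcome_order=[0, 1, 2, 3, 4], lam=1.0)
```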