Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- 
Leitner, Stephan (Ed.) Objective: Peer review frequently follows a process where reviewers first provide initial reviews, authors respond to these reviews, then reviewers update their reviews based on the authors' response. There is mixed evidence regarding whether this process is useful, including frequent anecdotal complaints that reviewers insufficiently update their scores. In this study, we aim to investigate whether reviewers anchor to their original scores when updating their reviews, which serves as a potential explanation for the lack of updates in reviewer scores. Design: We design a novel randomized controlled trial to test if reviewers exhibit anchoring. In the experimental condition, participants initially see a flawed version of a paper that is corrected after they submit their initial review, while in the control condition, participants only see the correct version. We take various measures to ensure that in the absence of anchoring, reviewers in the experimental group should revise their scores to be identically distributed to the scores from the control group. Furthermore, we construct the reviewed paper to maximize the difference between the flawed and corrected versions, and employ deception to hide the true experiment purpose. Results: Our randomized controlled trial consists of 108 researchers as participants. First, we find that our intervention was successful at creating a difference in perceived paper quality between the flawed and corrected versions: using a permutation test with the Mann-Whitney U statistic, we find that the experimental group's initial scores are lower than the control group's scores in both the Evaluation category (Vargha-Delaney A = 0.64, p = 0.0096) and Overall score (A = 0.59, p = 0.058). Next, we test for anchoring by comparing the experimental group's revised scores with the control group's scores. We find no significant evidence of anchoring in either the Overall (A = 0.50, p = 0.61) or Evaluation category (A = 0.49, p = 0.61). The Mann-Whitney U represents the number of individual pairwise comparisons across groups in which the value from the specified group is stochastically greater, while the Vargha-Delaney A is the normalized version in [0, 1]. Free, publicly-accessible full text available November 18, 2025. (A rough sketch of these test statistics appears after this list.)
- 
Collusion rings pose a significant threat to peer review. In these rings, reviewers who are also authors coordinate to manipulate paper assignments, often by strategically bidding on each other's papers. A promising solution is to detect collusion through these manipulated bids, enabling conferences to take appropriate action. However, while methods exist for detecting other types of fraud, no research has yet shown that identifying collusion rings is feasible. In this work, we consider the question of whether it is feasible to detect collusion rings from the paper bidding. We conduct an empirical analysis of two realistic conference bidding datasets and evaluate existing algorithms for fraud detection in other applications. We find that collusion rings can achieve considerable success at manipulating the paper assignment while remaining hidden from detection: for example, in one dataset, undetected colluders are able to achieve assignment to up to 30% of the papers authored by other colluders. In addition, when 10 colluders bid on all of each other's papers, no detection algorithm outputs a group of reviewers with more than 31% overlap with the true colluders. These results suggest that collusion cannot be effectively detected from the bidding using popular existing tools, demonstrating the need to develop more complex detection algorithms as well as those that leverage additional metadata (e.g., reviewer-paper text-similarity scores). Free, publicly-accessible full text available December 31, 2025. (A toy illustration of mutual-bid detection appears after this list.)
- 
Bailey, Henry Hugh (Ed.) Many peer-review processes involve reviewers submitting their independent reviews, followed by a discussion between the reviewers of each paper. A common question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this question by conducting a randomized controlled trial at the Conference on Uncertainty in Artificial Intelligence (UAI) 2022, where reviewer discussions were conducted over a typed forum. We randomly split the reviewers and papers into two conditions: one with anonymous discussions and the other with non-anonymous discussions. We also conduct an anonymous survey of all reviewers to understand their experience and opinions. We compare the two conditions in terms of the amount of discussion, the influence of seniority on the final decisions, politeness, and reviewers' self-reported experiences and preferences. Overall, this experiment finds small, significant differences favoring the anonymous discussion setup based on the evaluation criteria considered in this work. Free, publicly-accessible full text available December 27, 2025.
- 
            Free, publicly-accessible full text available December 2, 2025
- 
It is common to evaluate a set of items by soliciting people to rate them. For example, universities ask students to rate the teaching quality of their instructors, and conference organizers ask authors of submissions to evaluate the quality of the reviews. However, in these applications, students often give a higher rating to a course if they receive a higher grade in that course, and authors often give a higher rating to the reviews if their papers are accepted to the conference. In this work, we call these external factors the "outcome" experienced by people, and consider the problem of mitigating these outcome-induced biases in the given ratings when some information about the outcome is available. We formulate the information about the outcome as a known partial ordering on the bias. We propose a debiasing method by solving a regularized optimization problem under this ordering constraint, and also provide a carefully designed cross-validation method that adaptively chooses the appropriate amount of regularization. We provide theoretical guarantees on the performance of our algorithm, as well as experimental evaluations. (A simplified sketch of such an ordering-constrained fit appears after this list.)
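
The anchoring study above reports permutation tests on the Mann-Whitney U statistic together with the Vargha-Delaney A effect size. The minimal Python sketch below only illustrates how those two quantities relate and how a permutation p-value can be computed; it is not the authors' analysis code, and the two-sided rejection rule, function names, and example scores are assumptions.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def vargha_delaney_a(x, y):
    """A = U / (n_x * n_y): probability (counting ties as 1/2) that a score
    drawn from x exceeds a score drawn from y; A = 0.5 means no effect."""
    u, _ = mannwhitneyu(x, y, alternative="two-sided")
    return u / (len(x) * len(y))

def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the observed A under the null
    that the two groups' scores are exchangeable (an assumed test form)."""
    rng = np.random.default_rng(seed)
    observed = vargha_delaney_a(x, y)
    pooled = np.concatenate([x, y])
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a = vargha_delaney_a(pooled[:n_x], pooled[n_x:])
        if abs(a - 0.5) >= abs(observed - 0.5):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)  # add-one correction

# Hypothetical usage with made-up review scores:
# a, p = permutation_p_value(np.array([3, 4, 2, 5]), np.array([4, 5, 5, 3]))
```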
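
The collusion-ring entry evaluates existing fraud-detection algorithms on conference bidding data. Purely as a toy illustration of the kind of signal such detectors look for, and not any algorithm evaluated in that work, the sketch below flags groups of reviewers who bid on one another's papers; the data layout and the size-3 threshold are assumptions.

```python
from itertools import combinations

def mutual_bid_edges(bids, authored):
    """Edge (r1, r2) if each reviewer bid on at least one paper authored by the other.
    `bids`: dict reviewer -> set of paper ids they bid on.
    `authored`: dict reviewer -> set of paper ids they authored."""
    reviewers = set(bids) & set(authored)
    return {(r1, r2) for r1, r2 in combinations(sorted(reviewers), 2)
            if bids[r1] & authored[r2] and bids[r2] & authored[r1]}

def suspicious_groups(bids, authored, min_size=3):
    """Connected components of the mutual-bid graph (via union-find);
    larger components are candidates for manual inspection."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in mutual_bid_edges(bids, authored):
        parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return [g for g in groups.values() if len(g) >= min_size]
```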
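
The rating-debiasing entry describes solving a regularized optimization problem under an ordering constraint on the bias. The sketch below is a much-simplified instance of that idea, assuming a single rated item, a total order on rater outcomes instead of a general partial order, a squared-norm penalty, and a hand-picked λ rather than the paper's cross-validation; cvxpy is used only for convenience.

```python
import cvxpy as cp
import numpy as np

def debias_ratings(y, outcome_order, lam):
    """Estimate a shared quality q and per-rater biases b from ratings y,
    where b is constrained to be nondecreasing along `outcome_order`
    (rater indices sorted from worst to best experienced outcome)."""
    n = len(y)
    q = cp.Variable()   # shared underlying quality of the item
    b = cp.Variable(n)  # outcome-induced bias of each rater
    # Penalizing b also resolves the shift ambiguity between q and b.
    objective = cp.Minimize(cp.sum_squares(y - q - b) + lam * cp.sum_squares(b))
    constraints = [b[outcome_order[i]] <= b[outcome_order[i + 1]]
                   for i in range(n - 1)]
    cp.Problem(objective, constraints).solve()
    return float(q.value), np.asarray(b.value)

# Hypothetical usage: five raters, already sorted from worst to best outcome.
# y = np.array([2.0, 3.0, 3.0, 4.0, 5.0])
# q_hat, b_hat = debias_ratings(y, outcome_order=[0, 1, 2, 3, 4], lam=1.0)
```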