NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews.

Liang, Weixin; Izzo, Zachary; Zhang, Yaohui; Lepp, Haley; Cao, Hancheng; Zhao, Xuandong; Chen, Lingjiao; Ye, Haotian; Liu, Sheng; Huang, Zhi; et al (July 2024, ICML conference.)

We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM-use at the corpus level. We apply this approach to a case study of scientific peer review in AI conferences that took place after the release of ChatGPT: ICLR 2024, NeurIPS 2023, CoRL 2023 and EMNLP 2023. Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e. beyond spell-checking or minor writing updates. The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews which report lower confidence, were submitted close to the deadline, and from reviewers who are less likely to respond to author rebuttals. We also observe corpus-level trends in generated text which may be too subtle to detect at the individual level, and discuss the implications of such trends on peer review. We call for future interdisciplinary work to examine how LLM use is changing our information and knowledge practices.
more » « less
Full Text Available
Can Large Language Models Provide Useful Feedback on Research Papers? A Large-Scale Empirical Analysis

https://doi.org/10.1056/AIoa2400196

Liang, Weixin; Zhang, Yuhui; Cao, Hancheng; Wang, Binglu; Ding, Daisy Yi; Yang, Xinyu; Vodrahalli, Kailas; He, Siyu; Smith, Daniel Scott; Yin, Yian; et al (July 2024, NEJM AI)

BACKGROUND Expert feedback lays the foundation of rigorous research. However, the rapid growth of scholarly production challenges the conventional scienti c feedback mechanisms. High-quality peer reviews are increasingly dif cult to obtain. METHODS We created an automated pipeline using Generative Pretrained Transformer 4 (GPT-4) to provide comments on scienti c papers. We evaluated the quality of GPT-4’s feedback through two large-scale studies. We rst quantitatively compared GPT-4’s gen- erated feedback with human peer reviewers’ feedback in general scienti c papers from 15 Nature family journals (3096 papers in total) and the International Conference on Learning Representations (ICLR) machine learning conference (1709 papers). To speci - cally assess GPT-4’s performance on biomedical papers, we also analyzed a subset of 425 health sciences papers from the Nature portfolio and a random sample of 666 sub- missions to eLife. Additionally, we conducted a prospective user study with 308 research- ers from 110 institutions in the elds of arti cial intelligence and computational biology to understand how researchers perceive feedback generated by our system on their own papers. RESULTS The overlap in the points raised by GPT-4 and by human reviewers (average overlap of 30.85% for Nature journals and 39.23% for ICLR) is comparable with the over- lap between two human reviewers (average overlap of 28.58% for Nature journals and 35.25% for ICLR). Results on eLife and a subset of health sciences papers as categorized by the Nature portfolio show similar patterns. In our prospective user study, more than half (57.4%) of the users found GPT-4–generated feedback helpful/very helpful, and 82.4% found it more bene cial than feedback from at least some human reviewers. We also identify several limitations of large language model (LLM)–generated feedback. CONCLUSIONS Through both retrospective and prospec- tive evaluation, we nd substantial overlap between LLM and human feedback as well as positive user perceptions regarding the usefulness of LLM feedback. Although human expert review should continue to be the foundation of the scienti c process, LLM feedback could bene t researchers, especially when timely expert feedback is not available and in earlier stages of manuscript preparation. (Funded by the Chan–Zuckerberg Initiative and the Stanford Interdisciplin- ary Graduate Fellowship.)
more » « less
Full Text Available
How New Ideas Diffuse in Science

https://doi.org/10.1177/00031224231166955

Cheng, Mengjie; Smith, Daniel Scott; Ren, Xiang; Cao, Hancheng; Smith, Sanne; McFarland, Daniel A. (June 2023, American Sociological Review)

What conditions enable novel intellectual contributions to diffuse and become integrated into later scientific work? Prior work tends to focus on whole cultural products, such as patents and articles, and emphasizes external social factors as important. This article focuses on concepts as reflections of ideas, and we identify the combined influence that social factors and internal intellectual structures have on ideational diffusion. To develop this perspective, we use computational techniques to identify nearly 60,000 new ideas introduced over two decades (1993 to 2016) in the Web of Science and follow their diffusion across 38 million later publications. We find new ideas diffuse more widely when they socially and intellectually resonate. New ideas become core concepts of science when they reach expansive networks of unrelated authors, achieve consistent intellectual usage, are associated with other prominent ideas, and fit with extant research traditions. These ecological conditions play an increasingly decisive role later in an idea’s career, after their relations with the environment are established. This work advances the systematic study of scientific ideas by moving beyond products to focus on the content of ideas themselves and applies a relational perspective that takes seriously the contingency of their success.
more » « less
Full Text Available
Gendered knowledge in fields and academic careers

https://doi.org/10.1016/j.respol.2021.104411

Kim, Lanu; Smith, Daniel Scott; Hofstra, Bas; McFarland, Daniel A. (January 2022, Research Policy)
null (Ed.)
Full Text Available
Creative Destruction: The Structural Consequences of Scientific Curation

https://doi.org/10.1177/0003122421996323

McMahan, Peter; McFarland, Daniel A. (April 2021, American Sociological Review)
null (Ed.)
Communication of scientific findings is fundamental to scholarly discourse. In this article, we show that academic review articles, a quintessential form of interpretive scholarly output, perform curatorial work that substantially transforms the research communities they aim to summarize. Using a corpus of millions of journal articles, we analyze the consequences of review articles for the publications they cite, focusing on citation and co-citation as indicators of scholarly attention. Our analysis shows that, on the one hand, papers cited by formal review articles generally experience a dramatic loss in future citations. Typically, the review gets cited instead of the specific articles mentioned in the review. On the other hand, reviews curate, synthesize, and simplify the literature concerning a research topic. Most reviews identify distinct clusters of work and highlight exemplary bridges that integrate the topic as a whole. These bridging works, in addition to the review, become a shorthand characterization of the topic going forward and receive disproportionate attention. In this manner, formal reviews perform creative destruction so as to render increasingly expansive and redundant bodies of knowledge distinct and comprehensible.
more » « less
Full Text Available
Education Data Science: Past, Present, Future

https://doi.org/10.1177/23328584211052055

McFarland, Daniel A.; Khanna, Saurabh; Domingue, Benjamin W.; Pardos, Zachary A. (January 2021, AERA Open)
null (Ed.)
This AERA Open special topic concerns the large emerging research area of education data science (EDS). In a narrow sense, EDS applies statistics and computational techniques to educational phenomena and questions. In a broader sense, it is an umbrella for a fleet of new computational techniques being used to identify new forms of data, measures, descriptives, predictions, and experiments in education. Not only are old research questions being analyzed in new ways but also new questions are emerging based on novel data and discoveries from EDS techniques. This overview defines the emerging field of education data science and discusses 12 articles that illustrate an AERA-angle on EDS. Our overview relates a variety of promises EDS poses for the field of education as well as the areas where EDS scholars could successfully focus going forward.
more » « less
Full Text Available
The Patterning of Collaborative Behavior and Knowledge Culminations in Interdisciplinary Research Centers

https://doi.org/10.1007/s11024-019-09381-6

Mäkinen, Elina I.; Evans, Eliza D.; McFarland, Daniel A. (March 2020, Minerva)
null (Ed.)
Full Text Available
The Diversity–Innovation Paradox in Science

https://doi.org/10.1073/pnas.1915378117

Hofstra, Bas; Kulkarni, Vivek V.; Munoz-Najar Galvez, Sebastian; He, Bryan; Jurafsky, Dan; McFarland, Daniel A. (April 2020, Proceedings of the National Academy of Sciences)

Prior work finds a diversity paradox: Diversity breeds innovation, yet underrepresented groups that diversify organizations have less successful careers within them. Does the diversity paradox hold for scientists as well? We study this by utilizing a near-complete population of ∼1.2 million US doctoral recipients from 1977 to 2015 and following their careers into publishing and faculty positions. We use text analysis and machine learning to answer a series of questions: How do we detect scientific innovations? Are underrepresented groups more likely to generate scientific innovations? And are the innovations of underrepresented groups adopted and rewarded? Our analyses show that underrepresented groups produce higher rates of scientific novelty. However, their novel contributions are devalued and discounted: For example, novel contributions by gender and racial minorities are taken up by other scholars at lower rates than novel contributions by gender and racial majorities, and equally impactful contributions of gender and racial minorities are less likely to result in successful scientific careers than for majority groups. These results suggest there may be unwarranted reproduction of stratification in academic careers that discounts diversity’s role in innovation and partly explains the underrepresentation of some groups in academia.
more » « less
Full Text Available

Search for: All records