Abstract: The Disruption Index (D-index) provides the first quantitative framework for identifying breakthroughs in science and technology. As its use expands, questions have emerged about its meaning, strengths, and limitations. Because the D-index measures how a focal paper competes with its references for citation attention, some worry that it is distorted by historical changes in citation practices. For example, if papers cite more references over time (a trend known as "citation inflation") then newer papers might appear less disruptive even when equally inventive. We show that this concern is unfounded. Citation counts follow a long-tailed distribution, meaning competition is overwhelmingly shaped by the focal paper and its most-cited reference, while other references are negligible. Thus, the D-index captures whether a paper overturns a dominant idea in its field. The metric is fundamentally relational: it measures competition with predecessors rather than innovation in a vacuum. From this perspective, breakthroughs arise not only from generating novel ideas but also from replacing established ones, much like light bulbs replacing candles. We support this interpretation with mathematical analysis and large-scale bibliometric evidence.
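For readers unfamiliar with the metric, the computation behind the abstract can be sketched. The D-index, as commonly defined in the bibliometrics literature, compares papers that cite only the focal paper against papers that also cite its references. The sketch below is illustrative, not code from the paper; the function name and the set-based inputs are assumptions.

```python
def d_index(citers_of_focal, citers_of_refs):
    """Sketch of the standard Disruption (D) index of a focal paper.

    citers_of_focal: set of IDs of papers that cite the focal paper.
    citers_of_refs:  set of IDs of papers that cite at least one of
                     the focal paper's references.
    """
    n_f = len(citers_of_focal - citers_of_refs)  # cite the focal paper only
    n_b = len(citers_of_focal & citers_of_refs)  # cite both focal and references
    n_r = len(citers_of_refs - citers_of_focal)  # cite only the references
    total = n_f + n_b + n_r
    if total == 0:
        return 0.0  # no subsequent citations at all
    return (n_f - n_b) / total  # ranges from -1 (consolidating) to +1 (disruptive)
```

A positive value means later work cites the focal paper without its predecessors (the paper "replaces" them, like light bulbs replacing candles); a negative value means later work cites the focal paper together with its references (the paper consolidates them). For example, `d_index({"a", "b", "c"}, {"c", "d"})` yields 0.25.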
The use of ChatGPT for identifying disruptive papers in science: a first exploration
Abstract: ChatGPT has arrived in quantitative research evaluation. With the exploration in this Letter to the Editor, we would like to widen the spectrum of possible uses of ChatGPT in bibliometrics by applying it to identify disruptive papers. The identification of disruptive papers using publication and citation counts has become a popular topic in scientometrics. A disadvantage of the quantitative approach is its computational complexity. The use of ChatGPT might be an easy-to-use alternative.
- Award ID(s):
- 2239418
- PAR ID:
- 10571987
- Publisher / Repository:
- https://link.springer.com/article/10.1007/s11192-024-05176-z
- Date Published:
- Journal Name:
- Scientometrics
- Volume:
- 129
- Issue:
- 11
- ISSN:
- 0138-9130
- Page Range / eLocation ID:
- 7161 to 7165
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- Abstract: Objective: This Emerging Ideas report explores families' (parents and their children) uses and gratifications for ChatGPT. Background: Generative artificial intelligence-based conversational agents, such as ChatGPT, can be used to accomplish a variety of tasks, yet little is known about how and why parents and their children may use these technologies. Methods: We conducted semistructured, qualitative, exploratory interviews with 12 U.S.-based families that had experience sharing a ChatGPT account. Families were recruited using social media advertisements, and at least one child and one parent joined each interview. We asked families what they used ChatGPT for and why they used the platform. Results: Families reported four main motivators for using ChatGPT: (a) information seeking, (b) enhancing productivity, (c) entertainment, and (d) social bonding. Potential barriers to use included concerns about (a) ChatGPT's credibility and capabilities, (b) being less familiar with using ChatGPT, (c) the platform's ethical implications, and (d) possible privacy risks. Conclusion: Families use ChatGPT for various purposes, but their uses and gratifications may differ depending on their perceptions of and experiences with the platform. Implications: Our findings suggest that, with some improvements, ChatGPT has the potential to be a useful tool for both individual and shared use in families.
- Zervas, E. (Ed.) Recently, there has been a surge in general-purpose language models, with ChatGPT being the most advanced model to date. These models are primarily used for generating text in response to user prompts on various topics. How accurate and relevant ChatGPT's generated text is on specific topics needs to be validated, since the model is designed for general conversation rather than context-specific purposes. This study explores how ChatGPT, as a general-purpose model, performs on a real-world challenge such as climate change compared to ClimateBert, a state-of-the-art language model specifically trained on climate-related data from various sources, including texts, news, and papers. ClimateBert is fine-tuned on five different NLP classification tasks, making it a valuable benchmark for comparison with ChatGPT on various NLP tasks. The main results show that for climate-specific NLP tasks, ClimateBert outperforms ChatGPT.
- Abstract: Background: Systematic literature reviews (SLRs) are foundational for synthesizing evidence across diverse fields and are especially important in guiding research and practice in health and biomedical sciences. However, they are labor intensive due to manual data extraction from multiple studies. As large language models (LLMs) gain attention for their potential to automate research tasks and extract basic information, understanding their ability to accurately extract explicit data from academic papers is critical for advancing SLRs. Objective: Our study aimed to explore the capability of LLMs to extract both explicitly outlined study characteristics and deeper, more contextual information requiring nuanced evaluations, using ChatGPT (GPT-4). Methods: We screened the full text of a sample of COVID-19 modeling studies and analyzed three basic measures of study settings (ie, analysis location, modeling approach, and analyzed interventions) and three complex measures of behavioral components in models (ie, mobility, risk perception, and compliance). To extract data on these measures, two researchers independently extracted 60 data elements using manual coding and compared them with the responses from ChatGPT to 420 queries spanning 7 iterations. Results: ChatGPT's accuracy improved as prompts were refined, showing improvements of 33% and 23% between the initial and final iterations for extracting study settings and behavioral components, respectively. In the initial prompts, 26 (43.3%) of 60 ChatGPT responses were correct. However, in the final iteration, ChatGPT extracted 43 (71.7%) of the 60 data elements, showing better performance in extracting explicitly stated study settings (28/30, 93.3%) than in extracting subjective behavioral components (15/30, 50%). Nonetheless, the varying accuracy across measures highlighted its limitations. Conclusions: Our findings underscore LLMs' utility in extracting basic as well as explicit data in SLRs by using effective prompts. However, the results reveal significant limitations in handling nuanced, subjective criteria, emphasizing the necessity for human oversight.
- Abstract: Emerging studies underscore the promising capabilities of large language model-based chatbots in conducting basic bioinformatics data analyses. The recent feature of accepting image inputs by ChatGPT, also known as GPT-4V(ision), motivated us to explore its efficacy in deciphering bioinformatics scientific figures. Our evaluation with examples in cancer research, including sequencing data analysis, multimodal network-based drug repositioning, and tumor clonal evolution, revealed that ChatGPT can proficiently explain different plot types and apply biological knowledge to enrich interpretations. However, it struggled to provide accurate interpretations when color perception and quantitative analysis of visual elements were involved. Furthermore, while the chatbot can draft figure legends and summarize findings from the figures, stringent proofreading is imperative to ensure the accuracy and reliability of the content.