- Journal Name: WWW '21: Proceedings of the Web Conference
- Sponsoring Org: National Science Foundation
More Like this
A standard measure of the influence of a research paper is the number of times it is cited. However, papers may be cited for many reasons, and citation count offers limited information about the extent to which a paper affected the content of subsequent publications. We therefore propose a novel method to quantify linguistic influence in timestamped document collections. There are two main steps: first, identify lexical and semantic changes using contextual embeddings and word frequencies; second, aggregate information about these changes into per-document influence scores by estimating a high-dimensional Hawkes process with a low-rank parameter matrix. We show that this measure of linguistic influence is predictive of future citations: the estimate of linguistic influence from the two years after a paper’s publication is correlated with and predictive of its citation count in the following three years. This is demonstrated using an online evaluation with incremental temporal training/test splits, in comparison with a strong baseline that includes predictors for initial citation counts, topics, and lexical features.
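The two-step method above hinges on a multivariate Hawkes process: each document is a point process whose events (word-change signals) raise the intensity of later documents through an excitation matrix, and a low-rank factorization keeps the matrix tractable at corpus scale. The sketch below illustrates that structure only; the function names, the exponential kernel, and the parameter layout are assumptions for illustration, not the authors' implementation.

```python
import math

def hawkes_intensity(i, t, events, mu, A, beta):
    """Conditional intensity of process i at time t under an exponential kernel.

    events: dict mapping process index -> sorted list of past event times.
    mu:     list of baseline rates.
    A:      excitation matrix; A[i][j] is the influence of process j on i.
    beta:   decay rate of the exponential kernel (illustrative assumption).
    """
    lam = mu[i]
    for j, times in events.items():
        for tk in times:
            if tk < t:
                lam += A[i][j] * math.exp(-beta * (t - tk))
    return lam

def low_rank(U, V):
    """Build A = U V^T, rank len(U[0]) << number of documents.

    U is m x r, V is n x r; this is the low-rank parameterization
    that makes estimating a high-dimensional Hawkes process feasible.
    """
    r = len(U[0])
    return [[sum(U[i][k] * V[j][k] for k in range(r)) for j in range(len(V))]
            for i in range(len(U))]
```

Under this parameterization, a per-document influence score could plausibly aggregate the column of A associated with that document (how strongly it excites all others), though the paper's exact aggregation is not shown here.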
“Measuring the Evolution of a Scientific Field through Citation Frames.” Transactions of the Association for Computational Linguistics (TACL).
Citations have long been used to characterize the state of a scientific field and to identify influential works. However, writers use citations for different purposes, and these varied purposes influence uptake by future scholars. Unfortunately, our understanding of how scholars use and frame citations has been limited to small-scale manual citation analysis of individual papers. We perform the largest behavioral study of citations to date, analyzing how scientific works frame their contributions through different types of citations and how this framing affects the field as a whole. We introduce a new dataset of nearly 2,000 citations annotated for their function, and use it to develop a state-of-the-art classifier and label the papers of an entire field: Natural Language Processing. We then show how differences in framing affect scientific uptake and reveal the evolution of the publication venues and the field as a whole. We demonstrate that authors are sensitive to discourse structure and publication venue when citing, and that how a paper frames its work through citations is predictive of the citation count it will receive. Finally, we use changes in citation framing to show that the field of NLP is undergoing a significant increase in consensus.
Explaining and predicting the impact of authors within a community: an assessment of the bibliometric literature and application of machine learning
Following the widespread availability of computerized databases, much research has correlated bibliometric measures from papers or patents with subsequent success, typically measured as the number of publications or citations. Building on this large body of work, we ask the following questions: given available bibliometric information in one year, along with the combined theories on sources of creative breakthroughs from the literatures on creativity and innovation, how accurately can we explain the impact of authors in a given research community in the following year? In particular, who is most likely to publish, publish highly cited work, and even publish a highly cited outlier? And how accurately can these existing theories predict breakthroughs using only contemporaneous data? After reviewing and synthesizing (often competing) theories from the literatures, we simultaneously model the collective hypotheses based on available data in the year before RNA interference was discovered. We operationalize author impact using publication count, forward citations, and the more stringent definition of being in the top decile of the citation distribution. The explanatory power of current theories altogether ranges from less than 9% for being top cited to 24% for productivity. Machine learning (ML) methods yield findings similar to those of the explanatory linear models, and tangible […]
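The abstract's most stringent impact measure, membership in the top decile of the citation distribution, can be operationalized roughly as below. The function name and the tie-inclusive cutoff rule are illustrative assumptions, not the paper's exact definition.

```python
def top_decile_flags(citation_counts):
    """Flag authors whose citation count falls in the top 10% of the
    community's distribution. Ties at the cutoff are all flagged, so
    slightly more than 10% of authors may qualify."""
    n = len(citation_counts)
    k = max(1, n // 10)  # size of the top decile (at least one author)
    cutoff = sorted(citation_counts, reverse=True)[k - 1]
    return [c >= cutoff for c in citation_counts]
```

The resulting boolean vector is the kind of binary outcome that the linear models and ML classifiers described above would try to explain from the prior year's bibliometric features.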
Pain and Laboratory Animals: Publication Practices for Better Data Reproducibility and Better Animal Welfare
Scientists who perform major survival surgery on laboratory animals face a dual welfare and methodological challenge: how to choose surgical anesthetics and post-operative analgesics that will best control animal suffering, knowing that both pain and the drugs that manage pain can all affect research outcomes. Scientists who publish full descriptions of animal procedures allow critical and systematic reviews of data, demonstrate their adherence to animal welfare norms, and guide other scientists on how to conduct their own studies in the field. We investigated what information on animal pain management a reasonably diligent scientist might find in planning for a successful experiment. To explore how scientists in a range of fields describe their management of this ethical and methodological concern, we scored 400 scientific articles that included major animal survival surgeries as part of their experimental methods for the completeness of information on anesthesia and analgesia. The 400 articles (250 accepted for publication pre-2011, and 150 in 2014–15, along with 174 articles they reference) included thoracotomies, craniotomies, gonadectomies, organ transplants, peripheral nerve injuries, spinal laminectomies and orthopedic procedures in dogs, primates, swine, mice, rats and other rodents. We scored articles for Publication Completeness (PC), which was any mention of use of […]
Searching for relevant literature is a fundamental part of academic research, and it is becoming a more difficult and time-consuming task as millions of articles are published each year. As a solution, recommendation systems for academic papers attempt to help researchers find relevant papers quickly. This paper focuses on graph-based recommendation systems that leverage a citation network, a graph of papers linked by citations, to create a list of relevant papers. We define a citation relation based on the number of times the origin paper cites the reference paper, and use this relation to measure the strength of the connection between papers, creating a weighted network with citation relations as edge weights. We evaluate our proposed method on a real-world publication data set and conduct an extensive comparison with three state-of-the-art baseline methods. Our results show that citation-network-based recommendation systems using citation weights perform better than current methods.
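The weighted citation network described above can be sketched in a few lines: each (origin, reference) citation instance increments an edge weight, and a naive recommender then ranks a paper's neighbors by that weight. The function names and the ranking rule are illustrative assumptions, not the paper's evaluated system.

```python
from collections import defaultdict

def build_weighted_citation_graph(citations):
    """citations: iterable of (origin, reference) pairs, one per in-text citation.
    Edge weight = number of times origin cites reference (the 'citation relation')."""
    weights = defaultdict(int)
    for origin, ref in citations:
        weights[(origin, ref)] += 1
    graph = defaultdict(dict)
    for (origin, ref), w in weights.items():
        graph[origin][ref] = w
    return graph

def recommend(graph, paper, k=3):
    """Rank a paper's references by citation weight as naive recommendations."""
    return sorted(graph.get(paper, {}).items(), key=lambda kv: -kv[1])[:k]
```

A real system would propagate these weights through the network (e.g. via random walks) rather than ranking direct references only, but the edge construction is the part the abstract emphasizes.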