Predicting Long-Term Citations from Short-Term Linguistic Influence

Soni, Sandeep; Bamman, David; Eisenstein, Jacob

Citation Details

A standard measure of the influence of a research paper is the number of times it is cited. However, papers may be cited for many reasons, and citation count offers limited information about the extent to which a paper affected the content of subsequent publications. We therefore propose a novel method to quantify linguistic influence in timestamped document collections. There are two main steps: first, identify lexical and semantic changes using contextual embeddings and word frequencies; second, aggregate information about these changes into per-document influence scores by estimating a high-dimensional Hawkes process with a low-rank parameter matrix. We show that this measure of linguistic influence is predictive of future citations: the estimate of linguistic influence from the two years after a paper’s publication is correlated with and predictive of its citation count in the following three years. This is demonstrated using an online evaluation with incremental temporal training/test splits, in comparison with a strong baseline that includes predictors for initial citation counts, topics, and lexical features.
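The evaluation setup named in the abstract (incremental temporal training/test splits, with influence estimated over two years and citations counted over the following three) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the example years, and the choice to start the influence window in the year after publication are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def incremental_temporal_splits(
    years: List[int], min_train_years: int = 1
) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_years, test_year) pairs in chronological order.

    Each test year is paired only with earlier years for training, and
    the training window grows by one year at every step.
    (Hypothetical helper; the paper's exact protocol may differ.)
    """
    ordered = sorted(set(years))
    for i in range(min_train_years, len(ordered)):
        yield ordered[:i], ordered[i]


def observation_windows(
    pub_year: int, influence_years: int = 2, citation_years: int = 3
) -> Tuple[List[int], List[int]]:
    """Return the (influence, citation) year ranges for one paper.

    Defaults follow the abstract: two years of linguistic influence,
    then three years of citation counts. Starting the influence window
    in the year after publication is an assumption.
    """
    start = pub_year + 1
    influence = list(range(start, start + influence_years))
    cite_start = start + influence_years
    citation = list(range(cite_start, cite_start + citation_years))
    return influence, citation


if __name__ == "__main__":
    # Example years are made up for illustration.
    for train, test in incremental_temporal_splits([2001, 2002, 2003, 2004]):
        print("train on", train, "-> test on", test)
    print(observation_windows(2000))
```

Keeping every test year strictly later than its training years avoids leaking future information into the predictor, which is the point of evaluating online rather than with a random split.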

Award ID(s): 1813470; 1942591

PAR ID: 10382963

Author(s) / Creator(s): Soni, Sandeep; Bamman, David; Eisenstein, Jacob

Date Published: 2022-01-01

Journal Name: Findings of the Association for Computational Linguistics: EMNLP 2022

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.