Title: Code sharing in ecology and evolution increases citation rates but remains uncommon
Abstract: Biologists increasingly rely on computer code to collect and analyze their data, reinforcing the importance of published code for transparency, reproducibility, and training, and as a basis for further work. Here, we conduct a literature review estimating temporal trends in code sharing in ecology and evolution publications since 2010, and test for an influence of code sharing on citation rate. We find that code is rarely published (only 6% of papers), with little improvement over time. We also find that there may be incentives to publish code: publications that share code have tended to be low‐impact initially but accumulate citations faster, compensating for this deficit. Studies that additionally meet other Open Science criteria (open‐access publication or data sharing) have still higher citation rates, with publications meeting all three criteria (code sharing, data sharing, and open‐access publication) tending to have the most citations and the highest rate of citation accumulation.
Award ID(s):
2225078
PAR ID:
10537119
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Ecology and Evolution
Volume:
14
Issue:
8
ISSN:
2045-7758
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurate prediction of scientific impact is important for scientists, academic recommender systems, and granting organizations alike. Existing approaches rely on many years of leading citation values to predict a scientific paper’s citations (a proxy for impact), even though most papers make their largest contributions in the first few years after they are published. In this paper, we tackle a new problem: predicting a new paper’s citation time series from the date of publication (i.e., without leading values). We propose HINTS, a novel end-to-end deep learning framework that converts citation signals from dynamic heterogeneous information networks (DHIN) into citation time series. HINTS imputes pseudo-leading values for a paper in the years before it is published from DHIN embeddings, and then transforms these embeddings into the parameters of a formal model that can predict citation counts immediately after publication. Empirical analysis on two real-world datasets from Computer Science and Physics shows that HINTS is competitive with baseline citation prediction models. While we focus on citations, our approach generalizes to other “cold start” time series prediction tasks where relational data is available and accurate prediction in early timestamps is crucial.
  2. Motivation: Development of bioinformatics methods is a long, complex and resource-hungry process. Hundreds of these tools have been released. While some methods are highly cited and used, many suffer from relatively low citation rates. We empirically analyze a large collection of recently released methods in three diverse protein function and disorder prediction areas to identify key factors that contribute to increased citations. Results: We show that provision of a working web server significantly boosts citation rates. On average, methods with working web servers generate three times as many citations as tools that are available only as source code, have no code and no server, or are no longer available. This observation holds consistently across different research areas and publication years. We also find that differences in predictive performance are unlikely to impact citation rates. Overall, our empirical results suggest that a relatively low-cost investment in the provision and long-term support of web servers would substantially increase the impact of bioinformatics tools.
  3. Information about individual publications associated with grants funded by NSF to support SES research from 2000-2015 (see "SES grants, 2000-2015"). For grants with ten or fewer publications, we included information about all available publications in this dataset. For grants with more than ten publications, we randomly selected ten to include in this dataset. CSV file with 13 columns and names in header row: "Grant ID" is the ID from the Dimensions platform (string); "Grant Number" is the NSF Award number (integer); "Publication Title" is the title of the paper (text); "Publication Year" is the year in which the paper was published (year); "Authors" is a list or abbreviated list of the authors of the paper (text); "Journal" is the name of the scientific journal or outlet in which the paper is published (text); "Interdis Rubric 1" is a metric representing the dataset authors' assessment for the level of interdisciplinarity represented by the paper (integer: “1” indicated social and natural science interdisciplinarity where both social and environmental conditions are measured or explored and/or author affiliations included departments across these disciplines; “2” indicated general interdisciplinarity between two or more different fields (that may both be within natural or social science); and “3” indicated single-disciplinarity); "Citations" is the count of citations the paper had received as of the date listed in "date for cite count", as reported in Google Scholar (integer); "date for cite count" is the date on which citation count for the paper was obtained (ddBBByy); "Abstract" is the text of the abstract of the paper, where available (text); "Notes" are any notes added by the authors of the dataset (text).
  4. Many publications on COVID-19 were released on preprint servers such as medRxiv and bioRxiv. It is unknown how reliable these preprints are, and which ones will eventually be published in scientific journals. In this study, we use crowdsourced human forecasts to predict publication outcomes and future citation counts for a sample of 400 preprints with high Altmetric scores. Most of these preprints (70%) were published within 1 year of upload on a preprint server, with a considerable fraction (45%) appearing in a high-impact journal with a journal impact factor of at least 10. On average, the preprints received 162 citations within the first year. We found that forecasters can predict whether preprints will be published after 1 year and whether the publishing journal has high impact. Forecasts are also informative with respect to Google Scholar citations within 1 year of upload on a preprint server. For both types of assessment, we found statistically significant positive correlations between forecasts and observed outcomes. While the forecasts can help to provide a preliminary assessment of preprints at a faster pace than traditional peer review, it remains to be investigated whether such an assessment is suited to identifying methodological problems in preprints.
  5. Traditional citation analysis methods have been criticized because their theoretical base of statistical counts does not reflect the motive or judgment of citing authors. In particular, self-citations may give undue credit to a cited article or mislead scientific development. This research aims to answer the question of whether self-citation is biased by probing into the motives and context of citations. It takes an integrated and fine-grained view of self-citations by examining them via multiple lenses: polarity, density, and location of citations. In addition, it explores, for the first time, potential moderating effects of citation level and associations among location contexts of citations to the same references. We analyzed academic publications across different topics and disciplines using both qualitative and quantitative methods. The results provide evidence that self-citations are free of bias in terms of citation density and polarity uncertainty, but they can be biased with respect to positivity and negativity of citations. Furthermore, this study reveals impacts of self-citing behavior on some citation patterns involving citation density, location concentration, and associations. The examination of self-citing behavior from these new perspectives sheds new light on the nature and function of self-citing behavior.