Title: Legalize text recycling
Key points: Text recycling is the reuse of material from an author's own prior work in a new document. While the ethical aspects of text recycling have received considerable attention, the legal aspects have been largely ignored or inaccurately portrayed. Copyright laws and publisher contracts are difficult to interpret and highly variable, making it difficult for authors or editors to know when text recycling in research writing is legal or illegal. We argue that publishers should revise their author contracts to make text recycling explicitly legal as long as authors follow ethics‐based guidelines.
Award ID(s):
1737093
PAR ID:
10454765
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Learned Publishing
Volume:
36
Issue:
3
ISSN:
0953-1513
Page Range / eLocation ID:
473 to 476
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Schelble, Susan M; Elkins, Kelly M (Ed.)
    Like most scientists, chemists frequently have reason to reuse some materials from their own published articles in new ones, especially when producing a series of closely related papers. Text recycling, the reuse of material from one’s own works, has become a source of considerable confusion and frustration for researchers and editors alike. While text recycling does not pose the same level of ethical concern as matters such as data fabrication or plagiarism, it is much more common and complicated. Much of the confusion stems from a lack of clarity and consistency in publisher guidelines and publishing contracts. Matters are even more complicated when manuscripts are coauthored by researchers residing in different countries. This chapter demonstrates the nature of these problems through an analysis of a set of documents from a single publisher, the American Chemical Society (ACS). The ACS was chosen because it is a leading publisher of chemistry research and because its guidelines and publishing contracts address text recycling in unusual detail. The present analysis takes advantage of this detail to show both the importance of clear, thoughtfully designed text recycling policies and the problems that can arise when publishers fail to bring their various documents into close alignment. 
  2. Because science advances incrementally, scientists often need to repeat material included in their prior work when composing new texts. Such “text recycling” is a common but complex writing practice, so authors and editors need clear and consistent guidance about what constitutes appropriate practice. Unfortunately, publishers’ policies on text recycling to date have been incomplete, unclear, and sometimes internally inconsistent. Building on 4 years of research on text recycling in scientific writing, the Text Recycling Research Project has developed a model text recycling policy that should be widely applicable for research publications in scientific fields. This article lays out the challenges text recycling poses for editors and authors, describes key factors that were addressed in developing the policy, and explains the policy’s main features. 
  3. When writing journal articles, STEM researchers produce a number of other genres such as grant proposals and conference posters, and their articles routinely build directly on their own prior work. As a result, STEM authors often reuse material from their completed documents in producing new documents. While this practice, known as text recycling (or self-plagiarism), is a debated issue in publishing and research ethics, little is known about researchers’ beliefs about what constitutes appropriate practice. This article presents results from an exploratory, survey-based study on beliefs and attitudes toward text recycling among STEM “experts” (faculty researchers) and “novices” (graduate students and postdocs). While expert and novice researchers are fairly consistent in distinguishing between text recycling and plagiarism, there is considerable disagreement about appropriate text recycling practice.
  4. Legal texts routinely use concepts that are difficult to understand. Lawyers elaborate on the meaning of such concepts by, among other things, carefully investigating how they have been used in the past. Finding text snippets that mention a particular concept in a useful way is tedious, time-consuming, and hence expensive. We assembled a dataset of 26,959 sentences, coming from legal case decisions, and labeled them in terms of their usefulness for explaining selected legal concepts. Using the dataset, we study the effectiveness of transformer models pre-trained on large language corpora to detect which of the sentences are useful. In light of the models' predictions, we analyze various linguistic properties of the explanatory sentences as well as their relationship to the legal concept that needs to be explained. We show that the transformer-based models are capable of learning surprisingly sophisticated features and outperform the prior approaches to the task.
  5. Background: Text recycling (hereafter TR)—the reuse of one’s own textual materials from one document in a new document—is a common but hotly debated and unsettled practice in many academic disciplines, especially in the context of peer-reviewed journal articles. Although several analytic systems have been used to determine replication of text—for example, for purposes of identifying plagiarism—they do not offer an optimal way to compare documents to determine the nature and extent of TR in order to study and theorize this as a practice in different disciplines. In this article, we first describe TR as a common phenomenon in academic publishing, then explore the challenges associated with trying to study the nature and extent of TR within STEM disciplines. We then describe in detail the complex processes we used to create a system for identifying TR across large corpora of texts, and the sentence-level string-distance lexical methods used to refine and test the system (White & Joy, 2004). The purpose of creating such a system is to identify legitimate cases of TR across large corpora of academic texts in different fields of study, allowing meaningful cross-disciplinary comparisons in future analyses of published work. The findings from such investigations will extend and refine our understanding of discourse practices in academic and scientific settings. Literature Review: Text-analytic methods have been widely developed and implemented to identify reused textual materials for detecting plagiarism, and there is considerable literature on such methods. (Instead of taking up space detailing this literature, we point readers to several recent reviews: Gupta, 2016; Hiremath & Otari, 2014; and Meuschke & Gipp, 2013). Such methods include fingerprinting, term occurrence analysis, citation analysis (identifying similarity in references and citations), and stylometry (statistically comparing authors’ writing styles; see Meuschke & Gipp, 2013). 
Although TR occurs in a wide range of situations, recent debate has focused on recycling from one published research paper to another—particularly in STEM fields (see, for example, Andreescu, 2013; Bouville, 2008; Bretag & Mahmud, 2009; Roig, 2008; Scanlon, 2007). An important step in better understanding the practice is seeing how authors actually recycle material in their published work. Standard methods for detecting plagiarism are not directly suitable for this task, as the objective is not to determine the presence or absence of reuse itself, but to study the types and patterns of reuse, including materials that are syntactically but not substantively distinct—such as “patchwriting” (Howard, 1999). In the present account of our efforts to create a text-analytic system for determining TR, we take a conventional alphabetic approach to text, in part because we did not aim at this stage of our project to analyze non-discursive text such as images or other media. However, although the project adheres to conventional definitions of text, with a focus on lexical replication, we also subscribe to context-sensitive approaches to text production. The results of applying the system to large corpora of published texts can potentially reveal varieties in the practice of TR as a function of different discourse communities and disciplines. Writers’ decisions within what appear to be canonical genres are contingent, based on adherence to or deviation from existing rules and procedures if and when these actually exist. Our goal is to create a system for analyzing TR in groups of texts produced by the same authors in order to determine the nature and extent of TR, especially across disciplinary areas, without judgment of scholars’ use of the practice. 