Abstract Many have argued that datasets resulting from scientific research should be part of the scholarly record as first class research products. Data sharing mandates from funding agencies and scientific journal publishers along with calls from the scientific community to better support transparency and reproducibility of scientific research have increased demand for tools and support for publishing datasets. Hydrology domain‐specific data publication services have been developed alongside more general purpose and even commercial data repositories. Prominent among these are the Hydrologic Information System (HIS) and HydroShare repositories developed by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI). More broadly, however, multiple organizations have been involved in the practice of data publication in the hydrology domain, each having different roles that have shaped data publication and reuse. Bibliographic and archival approaches to data publication have been advanced, but both have limitations with respect to hydrologic data. Specific recommendations for improving data publication infrastructure, support, and practices to move beyond existing limitations and enable more effective data publication in support of scientific research in the hydrology domain include: improving support for journal article‐based data access and data citation, considering the workflow for data publication, enhancing support for reproducible science, encouraging publication of curated reference data collections, advancing interoperability standards for sharing data and metadata among repositories, developing partnerships with university libraries offering data services, and developing more specific data management plans. While presented in the context of CUAHSI's data repositories and experience, these recommendations are broadly applicable to other domains. This article is categorized under: Science of Water > Methods
A dataset of publication records for Nobel laureates
Abstract A central question in the science of science concerns how to develop a quantitative understanding of the evolution and impact of individual careers. Over the course of history, a relatively small fraction of individuals have made disproportionate, profound, and lasting impacts on science and society. Despite a long-standing interest in the careers of scientific elites across diverse disciplines, it remains difficult to collect large-scale career histories that could serve as training sets for systematic empirical and theoretical studies. Here, by combining unstructured data collected from CVs, university websites, and Wikipedia, together with the publication and citation database from Microsoft Academic Graph (MAG), we reconstructed publication histories of nearly all Nobel prize winners from the past century, through both manual curation and algorithmic disambiguation procedures. Data validation shows that the collected dataset is among the most comprehensive collections of publication records for Nobel laureates currently available. As our quantitative understanding of science deepens, this dataset is expected to have increasing value. It will not only allow us to quantitatively probe novel patterns of productivity, collaboration, and impact governing successful scientific careers, but may also help us unearth the fundamental principles underlying creativity and the genesis of scientific breakthroughs.
- Award ID(s): 1829344
- PAR ID: 10153819
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: Scientific Data
- Volume: 6
- Issue: 1
- ISSN: 2052-4463
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In a typical science class, communication exercises may include a variety of outputs including lab reports, posters, reflective writing, or research proposals. However, a growing number of students are engaging in more complex and professional communication endeavors, including scientific publication. The chance to write a research paper and experience the peer-review and publication processes may provide students the opportunity to integrate several practices from the Next Generation Science Standards, as well as share their research in a more public setting. Although we have some limited understanding in terms of the outcomes that students experience when engaging in peer-review and publication of their science research papers, we have no information or data regarding why students want to participate in these processes. As such, the purpose of this study is to investigate the motivations of pre-college students to pursue peer-review and publication of their scientific research papers. Using the theory of science identity to analyze the data, I found that students view publication as a mechanism to grow their scientific skills and be recognized as a scientist. The findings suggest that providing students the opportunity to share their research in more public settings could be a factor in developing their science identity.
-
Data are becoming increasingly important in science and society, and thus data literacy is a vital asset to students as they prepare for careers in and outside science, technology, engineering, and mathematics and go on to lead productive lives. In this paper, we discuss why the strongest learning experiences surrounding data literacy may arise when students are given opportunities to work with authentic data from scientific research. First, we explore the overlap between the fields of quantitative reasoning, data science, and data literacy, specifically focusing on how data literacy results from practicing quantitative reasoning and data science in the context of authentic data. Next, we identify and describe features that influence the complexity of authentic data sets (selection, curation, scope, size, and messiness) and implications for data-literacy instruction. Finally, we discuss areas for future research with the aim of identifying the impact that authentic data may have on student learning. These include defining desired learning outcomes surrounding data use in the classroom and identification of teaching best practices when using data in the classroom to develop students’ data-literacy abilities.
-
Abstract Mentorship in science is crucial for topic choice, career decisions, and the success of mentees and mentors. Typically, researchers who study mentorship use article co-authorship and doctoral dissertation datasets. However, available datasets of this type focus on narrow selections of fields and miss out on early career and non-publication-related interactions. Here, we describe Mentorship, a crowdsourced dataset of 743176 mentorship relationships among 738989 scientists primarily in biosciences that avoids these shortcomings. Our dataset enriches the Academic Family Tree project by adding publication data from the Microsoft Academic Graph and “semantic” representations of research using deep learning content analysis. Because gender and race have become critical dimensions when analyzing mentorship and disparities in science, we also provide estimations of these factors. We perform extensive validations of the profile–publication matching, semantic content, and demographic inferences, which mostly cover neuroscience and biomedical sciences. We anticipate this dataset will spur the study of mentorship in science and deepen our understanding of its role in scientists’ career outcomes.
-
Societal Impact Statement: It is important to recognize how our current understanding of plants has been shaped by diverse cultural contexts, as this underscores the importance of valuing and incorporating contributions from all knowledge systems in scientific pursuits. This approach emphasizes the ongoing bias, including within scientific practices, and the necessity of discussing problematic histories within spaces of learning. It is crucial to acknowledge and address biases, even within scientific endeavors. Doing so fosters a more inclusive and equitable scientific community. This article, while not comprehensive, serves as a starting point for conversation and an introduction to current work on these topics. Summary: In response to a global dialog about systemic racism, ongoing inequalities, appeals to decolonize science, and the many recent calls for diversity, equity, accessibility, and inclusion, we draw on the narratives of plants to revisit the history of botany. Our goal is to uncover how exclusionary practices have functioned in the past and persist today. We also explore the numerous opportunities and challenges that arise in the era of information as we strive to establish a more inclusive field of botany. This approach recognizes and honors the contributions of historically marginalized groups, such as Black and Indigenous communities. We hope that this article can serve as a catalyst for raising awareness, fostering contemplation, and driving action toward a more equitable and just scientific community.
