Biology today is heavily data-driven and knowledge-centric, with data and knowledge stored across the linked open web in numerous heterogeneous deep web databases. To improve searching, finding, accessing, and interoperating among these diverse information sources, and thereby increase usability, the FAIR data principles have been proposed. Unfortunately, FAIR compliance is extremely low and linked open data does not guarantee FAIRness, leaving biologists on a solo hunt for information on the open network. In this paper, we propose SoDa for intelligent data foraging on the internet. SoDa helps biologists discover resources based on analysis requirements, generate resource access plans, and store cleaned data and knowledge for community use. A secondary search index is also supported for community members to find archived information conveniently.
Primary Sources as Linked Data: Exploring Motives Across the Sciences and Social Sciences
ABSTRACT While long recognized in the humanities, there is growing recognition in the sciences and social sciences that primary sources—as diverse as manuscripts, photographs, cultural belongings, and specimens—hold vast data about scientific and human knowledge for use in scholarship, community research, and global knowledge. Yet, data embedded in these sources are largely disconnected from the systems of discovery, access, and structured data that support reuse and insights across globally dispersed repositories. In this paper, we share select findings of a systematic review to explore the use of primary sources, and the data embedded in them, via linked data across the sciences and social sciences. Our results confirm the use of a variety of primary source data across diverse disciplines, particularly those requiring longitudinal studies and data integration from diverse repositories and contexts. We highlight how linked data are understood to: connect collections to communities; support highly granular credit, attribution, and assessment of impact; and interrelate diverse sources of knowledge. While these results suggest the value of linked data for the specific research needs of anthropology, the effectiveness of linked data in achieving these objectives and the suitability of this approach for a diversity of institutions and communities need further study.
- Award ID(s):
- 2314762
- PAR ID:
- 10607892
- Publisher / Repository:
- Proceedings of the Association for Information Science and Technology
- Date Published:
- Journal Name:
- Proceedings of the Association for Information Science and Technology
- Volume:
- 61
- Issue:
- 1
- ISSN:
- 2373-9231
- Page Range / eLocation ID:
- 232 to 245
- Subject(s) / Keyword(s):
- Data reuse; anthropology; archives; primary sources; science; social science; linked data
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Machine learning (ML) provides a powerful framework for the analysis of high‐dimensional datasets by modelling complex relationships, often encountered in modern data with many variables, cases and potentially non‐linear effects. The impact of ML methods on research and practical applications in the educational sciences is still limited, but continuously grows, as larger and more complex datasets become available through massive open online courses (MOOCs) and large‐scale investigations. The educational sciences are at a crucial pivot point, because of the anticipated impact ML methods hold for the field. To provide educational researchers with an elaborate introduction to the topic, we provide an instructional summary of the opportunities and challenges of ML for the educational sciences, show how a look at related disciplines can help learning from their experiences, and argue for a philosophical shift in model evaluation. We demonstrate how the overall quality of data analysis in educational research can benefit from these methods and show how ML can play a decisive role in the validation of empirical models. Specifically, we (1) provide an overview of the types of data suitable for ML and (2) give practical advice for the application of ML methods. In each section, we provide analytical examples and reproducible R code. Also, we provide an extensive Appendix on ML‐based applications for education. This instructional summary will help educational scientists and practitioners to prepare for the promises and threats that come with the shift towards digitisation and large‐scale assessment in education.
Context and implications
Rationale for this study
In 2020, the worldwide SARS‐CoV‐2 pandemic forced the educational sciences to perform a rapid paradigm shift with classrooms going online around the world—a hardly novel but now strongly catalysed development. In the context of data‐driven education, this paper demonstrates that the widespread adoption of machine learning techniques is central for the educational sciences and shows how these methods will become crucial tools in the collection and analysis of data and in concrete educational applications. Helping to leverage the opportunities and to avoid the common pitfalls of machine learning, this paper provides educators with the theoretical, conceptual and practical essentials.
Why the new findings matter
The process of teaching and learning is complex, multifaceted and dynamic. This paper contributes a seminal resource to highlight the digitisation of the educational sciences by demonstrating how new machine learning methods can be effectively and reliably used in research, education and practical application.
Implications for educational researchers and policy makers
The progressing digitisation of societies around the globe and the impact of the SARS‐CoV‐2 pandemic have highlighted the vulnerabilities and shortcomings of educational systems. These developments have shown the necessity to provide effective educational processes that can support sometimes overwhelmed teachers to digitally impart knowledge on the plan of many governments and policy makers. Educational scientists, corporate partners and stakeholders can make use of machine learning techniques to develop advanced, scalable educational processes that account for individual needs of learners and that can complement and support existing learning infrastructure. The proper use of machine learning methods can contribute essential applications to the educational sciences, such as (semi‐)automated assessments, algorithmic‐grading, personalised feedback and adaptive learning approaches. However, these promises are strongly tied to an at least basic understanding of the concepts of machine learning and a degree of data literacy, which has to become the standard in education and the educational sciences. Demonstrating both the promises and the challenges that are inherent to the collection and the analysis of large educational data with machine learning, this paper covers the essential topics that their application requires and provides easy‐to‐follow resources and code to facilitate the process of adoption.
-
Successful management and mitigation of marine challenges depends on cooperation and knowledge sharing, which often occurs across culturally diverse geographic regions. Global ocean science collaboration is therefore essential for developing global solutions. Building effective global research networks that can enable collaboration also needs to ensure inter- and transdisciplinary research approaches to tackle complex marine socio-ecological challenges. To understand the contribution of interdisciplinary global research networks to solving these complex challenges, we use the Integrated Marine Biosphere Research (IMBeR) project as a case study. We investigated the diversity and characteristics of 1,827 scientists from 11 global regions who were attendees at different IMBeR global science engagement opportunities since 2009. We also determined the role of social science engagement in natural science based regional programmes (using key informants) and identified the potential for enhanced collaboration in the future. Event attendees were predominantly from western Europe, North America, and East Asia. But overall, in the global network, there was growing participation by females, students and early career researchers, and social scientists, thus assisting in moving toward interdisciplinarity in IMBeR research. The mainly natural science oriented regional programmes showed mixed success in engaging and collaborating with social scientists. This was mostly attributed to the largely natural science (i.e., biological, physical) goals and agendas of the programmes, and the lack of institutional support and push to initiate connections with social science. 
While recognising that social science research may not be relevant to all the aims and activities of all regional programmes, all researchers nevertheless recognised the (potential) benefits of interdisciplinarity, which included broadening scientists’ understanding and perspectives, developing connections and interlinkages, and making science more useful. Pathways to achieve progress in regional programmes fell into four groups: specific funding, events to come together, within-programme-reflections, and social science champions. Future research programmes should have a strategic plan to be truly interdisciplinary, engaging natural and social sciences, as well as aiding early career professionals to actively engage in such programmes.
-
Abstract Extreme weather events, such as hurricanes with intense rainfall and storm surges, are posing increasing challenges to local communities worldwide. These hazards not only result in substantial property damage but also lead to significant population displacement. Federal disaster assistance programs are crucial for providing financial support for disaster response and recovery, but the allocation of these resources is often unequal due to the complex interplay of environmental, social, and institutional factors. Relying on datasets collected from diverse sources, this study employs a structural equation model to explore the complex relationships between disaster damage (DD), social vulnerability (SV), public disaster assistance (PDA), the national flood insurance (NFI), and population migration (PM) across counties in the contiguous US. Our findings reveal that communities with lower SV tend to experience higher levels of DD across US counties. SV is negatively associated with PM, PDA, and NFI, both directly and indirectly. Furthermore, PDA is positively linked to PM, whereas DD has a direct negative effect on PM but an indirect positive effect through PDA.
-
This material is primarily based upon work supported by the National Science Foundation Graduate Research Fellowship (grant no. DGE-1321845). Addressing complex social-ecological issues requires all relevant sources of knowledge and data, especially those held by communities who remain close to the land. Centuries of oppression, extractive research practices, and misrepresentation have hindered balanced knowledge exchange with Indigenous communities and inhibited innovation and problem-solving capacity in all scientific fields. A recent shift in the research landscape reflects a growing interest in engaging across diverse communities and ways of knowing. Scientific discussions increasingly highlight the inherent value of Indigenous environmental ethics frameworks and processes as the original roadmaps for sustainable development planning, including their potential in addressing the climate crisis and related social and environmental concerns. Momentum in this shift is also propelled by an increasing body of research evidencing the role of Indigenous land stewardship for maintaining ecological health and biodiversity. However, a key challenge straining this movement lies rooted in colonial residue and ongoing actions that suppress and co-opt Indigenous knowledge systems. Scientists working with incomplete datasets privilege a handful of narratives, conceptual understandings, languages, and historical contexts, while failing to engage thousands of collective bodies of intergenerational, place-based knowledge systems. The current dominant colonial paradigm in scientific research risks continued harmful impacts to Indigenous communities that sustain diverse knowledge systems. Here, we outline how ethical standards in researcher practice can be raised in order to reconcile colonial legacies and ongoing settler colonial practices. 
We synthesize across Indigenous and community-based research protocols and frameworks, transferring knowledge across disciplines, and ground truthing methods and processes in our own practice, to present a relational science working model for supporting Indigenous rights and reconciliation in research. We maintain that core Indigenous values of integrity, respect, humility, and reciprocity should shape researcher responsibilities and methods applied in order to raise ethical standards and long-term relational accountability regarding Indigenous lands, rights, communities, and our shared futures.

