Modern science depends on computers, but not all scientists have access to the scale of computation they need. A digital divide separates scientists who accelerate their science using large cyberinfrastructure from those who do not, or who do not have access to the compute resources or learning opportunities to develop the skills needed. The exclusionary nature of the digital divide threatens equity and the future of innovation by leaving people out of the scientific process while over-amplifying the voices of a small group who have resources. However, there are potential solutions: recent advancements in public research cyberinfrastructure and resources developed during the open science revolution are providing tools that can help bridge this divide. These tools can enable access to fast and powerful computation with modest internet connections and personal computers. Here we contribute another resource for narrowing the digital divide: scalable virtual machines running on public cloud infrastructure. We describe the tools, infrastructure, and methods that enabled successful deployment of a reproducible and scalable cyberinfrastructure architecture for a collaborative data synthesis working group in February 2023. This platform enabled 45 scientists with varying data and compute skills to leverage 40,000 hours of compute time over a 4-day workshop. Our approach provides an open framework that can be replicated for educational and collaborative data synthesis experiences in any data- and compute-intensive discipline.
Zero to a trillion: Advancing surface process studies with open access to high resolution topography
High-resolution topography (HRT) is a powerful observational tool for studying the Earth's surface, vegetation, and urban landscapes, with broad scientific, engineering, and education-based applications. Submeter resolution imaging is possible when data are collected with laser and photogrammetric techniques using ground, air, and space-based platforms. Open access to these data and a cyberinfrastructure platform that enables users to discover, manage, share, and process them increase the impact of investments in data collection and catalyze scientific discovery. Furthermore, open and online access to data enables broad interdisciplinary use of HRT across academia and in communities such as education, public agencies, and the commercial sector. OpenTopography (OT), supported by the US National Science Foundation, aims to democratize access to Earth science-oriented HRT data and processing tools. We utilize cyberinfrastructure, including large-scale data management, high-performance computing, and service-oriented architectures, to provide efficient web-based visualization and access to large HRT datasets. OT colocates data with processing tools to enable users to quickly access custom data and derived products for their application, with the ultimate goal of making these powerful data easier to use. OT's rapidly growing data holdings currently include 283 lidar and photogrammetric point cloud datasets (>1.2 trillion points) covering 236,364 km². As a testament to OT's success, more than 86,000 users have processed over 5 trillion lidar points. This use has resulted in more than 290 peer-reviewed publications across numerous academic domains including Earth science, geography, computer science, and ecology.
- Award ID(s): 1833632
- PAR ID: 10387368
- Editor(s): Tarolli, P.; Mudd, S.
- Date Published:
- Journal Name: Developments in earth surface processes
- Volume: 23
- ISSN: 0928-2025
- Page Range / eLocation ID: 317-338
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Scientific workflow management systems (WfMS) provide a systematic way to streamline necessary processes in scientific research. The demand for FAIR (Findable, Accessible, Interoperable, and Reusable) workflows is increasing in the scientific community, particularly in GIScience, where data is not just an output but an integral part of iterative advanced processes. Traditional WfMS often lack the capability to ensure geospatial data and process transparency, leading to challenges in reproducibility and replicability of research findings. This paper proposes the conceptualization and development of FAIR-oriented GIScience WfMS, aiming to incorporate the FAIR principles into the entire lifecycle of geospatial data processing and analysis. To enhance the findability and accessibility of workflows, the WfMS utilizes Harvard Dataverse to share all workflow-related digital resources, organized into workflow datasets, nodes, and case studies. Each resource is assigned a unique DOI (Digital Object Identifier), ensuring easy access and discovery. More importantly, the WfMS complies with the Common Workflow Language (CWL) standard to guarantee interoperability and reproducibility of workflows. It also enables the integration of diverse tools and software, supporting complex analyses that require multiple processing steps. This paper demonstrates the prototype of the GIScience WfMS and illustrates two geospatial science case studies, reflecting its flexibility in selecting appropriate techniques for various datasets and research goals. The user-friendly workflow designer makes it accessible to users with different levels of technical expertise, promoting reusable, reproducible, and replicable GIScience studies.
Abstract. Global change research demands a convergence among academic disciplines to understand complex changes in Earth system function. Limitations related to data usability and computing infrastructure, however, present barriers to effective use of the research tools needed for this cross-disciplinary collaboration. To address these barriers, we created a computational platform that pairs meteorological data and site-level ecosystem characterizations from the National Ecological Observatory Network (NEON) with the Community Terrestrial System Model (CTSM) that is developed with university partners at the National Center for Atmospheric Research (NCAR). This NCAR–NEON system features a simplified user interface that facilitates access to and use of NEON observations and NCAR models. We present preliminary results that compare observed NEON fluxes with CTSM simulations and describe how the collaboration between NCAR and NEON, which can be used by the global change research community, improves both the data and the model. Beyond datasets and computing, the NCAR–NEON system includes tutorials and visualization tools that facilitate interaction with observational and model datasets and further enable opportunities for teaching and research. By expanding access to data, models, and computing, cyberinfrastructure tools like the NCAR–NEON system will accelerate integration across ecology and climate science disciplines to advance understanding in Earth system science and global change.
Scientific data, its analysis, accuracy, completeness, and reproducibility play a vital role in advancing science and engineering. Open Science Chain (OSC) is a cyberinfrastructure platform built using the Hyperledger Fabric (HLF) blockchain technology to address issues related to data reproducibility and accountability in scientific research. OSC preserves the integrity of research datasets and enables different research groups to share datasets with the integrity information. Additionally, it enables quick verification of the exact datasets that were used for a particular published study and tracks their provenance. In this paper, we describe OSC's command line utility that will preserve the integrity of research datasets from within the researchers' environment or from remote systems such as HPC resources or campus clusters used for research. The Python-based command line utility can be seamlessly integrated within research workflows and provides an easy way to preserve the integrity of research data in the OSC blockchain platform.
ABSTRACT River morphology data are critical for understanding and studying river processes and for managing rivers for multiple socio‐economic uses. While such data have been extensively acquired, several issues hinder their use such as data accessibility, various data formats, lack of data models for storage, and lack of processing tools to assemble data in products readily usable for research, management, and education. A multi‐university research team has prototyped a web‐based river morphology information system (RIMORPHIS) for hosting and creating new information (e.g., terrain and material composition data) and data processing tools for the broader earth science communities. The RIMORPHIS design principles include: (i) broad access via a publicly and freely available platform‐independent system; (ii) flexibility in handling existing and future data types; (iii) user‐friendly and interactive interfaces; and (iv) interoperability and scalability to ensure platform sustainability. Developing such an ambitious community resource is only possible and impactful by continuously engaging stakeholders from the project inception. This paper highlights the research team's strategy and activities to engage with river morphology data producers and potential users from academia, research, and practice. The paper also details outcomes of stakeholder engagement and illustrates how these interactions are positively shaping RIMORPHIS development and its path to long‐term sustainability.

