skip to main content

Search for: All records

Award ID contains: 1740693

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. To effectively disseminate location-linked information despite the existence of digital walls across institutions, this study developed a cross-institution mobile App, named GeoFairy2, to overcome the virtual gaps among multi-source datasets and aid the general users to make thorough accurate in-situ decisions. The app provides a one-stop service with relevant information to assist with instant decision making. It was tested and proven to be capable of on-demand coupling and delivering location-based information from multiple sources. The app can help general users to crack down the digital walls among information pools and serve as a one-stop retrieval place for all information. GeoFairy2 was experimented with to gather real-time and historical information about crops, soil, water, and climate. Instead of a one-way data portal, GeoFairy2 allows general users to submit photos and observations to support citizen science projects and derive new insights, and further refine the future service. The two-directional mechanism makes GeoFairy2 a useful mobile gateway to access and contribute to the rapidly growing, heterogeneous, multisource, and location-linked datasets, and pave a way to drive us into a new mobile web with more links and less digital walls across data providers and institutions.
  2. AI (artificial intelligence)-based analysis of geospatial data has gained a lot of attention. Geospatial datasets are multi-dimensional; have spatiotemporal context; exist in disparate formats; and require sophisticated AI workflows that include not only the AI algorithm training and testing, but also data preprocessing and result post-processing. This complexity poses a huge challenge when it comes to full-stack AI workflow management, as researchers often use an assortment of time-intensive manual operations to manage their projects. However, none of the existing workflow management software provides a satisfying solution on hybrid resources, full file access, data flow, code control, and provenance. This paper introduces a new system named Geoweaver to improve the efficiency of full-stack AI workflow management. It supports linking all the preprocessing, AI training and testing, and post-processing steps into a single automated workflow. To demonstrate its utility, we present a use case in which Geoweaver manages end-to-end deep learning for in-time crop mapping using Landsat data. We show how Geoweaver effectively removes the tedium of managing various scripts, code, libraries, Jupyter Notebooks, datasets, servers, and platforms, greatly reducing the time, cost, and effort researchers must spend on such AI-based workflows. The concepts demonstrated through Geoweaver serve as an important buildingmore »block in the future of cyberinfrastructure for AI research.« less
  3. Understanding the past, present, and changing behavior of the climate requires close collaboration of a large number of researchers from many scientific domains. At present, the necessary interdisciplinary collaboration is greatly limited by the difficulties in discovering, sharing, and integrating climatic data due to the tremendously increasing data size. This paper discusses the methods and techniques for solving the inter-related problems encountered when transmitting, processing, and serving metadata for heterogeneous Earth System Observation and Modeling (ESOM) data. A cyberinfrastructure-based solution is proposed to enable effective cataloging and two-step search on big climatic datasets by leveraging state-of-the-art web service technologies and crawling the existing data centers. To validate its feasibility, the big dataset served by UCAR THREDDS Data Server (TDS), which provides Petabyte-level ESOM data and updates hundreds of terabytes of data every day, is used as the case study dataset. A complete workflow is designed to analyze the metadata structure in TDS and create an index for data parameters. A simplified registration model which defines constant information, delimits secondary information, and exploits spatial and temporal coherence in metadata is constructed. The model derives a sampling strategy for a high-performance concurrent web crawler bot which is used to mirror the essentialmore »metadata of the big data archive without overwhelming network and computing resources. The metadata model, crawler, and standard-compliant catalog service form an incremental search cyberinfrastructure, allowing scientists to search the big climatic datasets in near real-time. The proposed approach has been tested on UCAR TDS and the results prove that it achieves its design goal by at least boosting the crawling speed by 10 times and reducing the redundant metadata from 1.85 gigabytes to 2.2 megabytes, which is a significant breakthrough for making the current most non-searchable climate data servers searchable.« less