Machine learning allows “the machine” to deduce the complex and sometimes unrecognized rules governing spatial systems, particularly topographic mapping, by exposing it to the end product. The obstacle to this approach is often the acquisition of a large number of high-quality, labeled training examples of the desired result, as is the case with most types of natural features. To address this limitation, this research introduces GeoNat v1.0, a natural feature dataset that supports artificial intelligence‐based mapping and automated detection of natural features under a supervised learning paradigm. The dataset was created by randomly selecting points from the U.S. Geological Survey’s Geographic Names Information System and includes approximately 200 examples of each of 10 classes of natural features. The resulting data were tested in an object‐detection problem using a region‐based convolutional neural network, yielding a baseline mean average precision of 62%. Major challenges in developing training data in the geospatial domain, such as scale and geographical representativeness, are addressed in this article. We hope that the resulting dataset will be useful for a variety of applications and will shed light on training data collection and labeling in the geospatial artificial intelligence domain.
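The abstract describes randomly selecting points so that each of the 10 classes has approximately 200 examples. The following is a minimal sketch of such per-class random sampling, not the authors' procedure: the record layout (a name paired with a feature class), the class names, the helper `sample_per_class`, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_per_class(records, per_class, seed=0):
    """Randomly pick up to `per_class` names from each feature class.

    `records` is an iterable of (name, feature_class) pairs. This layout is
    hypothetical and is not the actual gazetteer schema.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_class = defaultdict(list)
    for name, feature_class in records:
        by_class[feature_class].append(name)

    sample = {}
    for feature_class, names in sorted(by_class.items()):
        k = min(per_class, len(names))  # small classes contribute all they have
        sample[feature_class] = rng.sample(names, k)  # without replacement
    return sample


# Toy gazetteer with illustrative class names and 500 entries per class.
records = [
    (f"{cls} {i}", cls)
    for cls in ("Summit", "Lake", "Valley")
    for i in range(500)
]
sample = sample_per_class(records, per_class=200)
print({cls: len(names) for cls, names in sample.items()})
```

Sampling within each class, rather than over the whole gazetteer, keeps the class counts balanced even when some feature types are far more common than others.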
- PAR ID: 10456730
- Publisher / Repository: Wiley-Blackwell
- Date Published:
- Journal Name: Transactions in GIS
- Volume: 24
- Issue: 3
- ISSN: 1361-1682
- Page Range / eLocation ID: p. 556-572
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Outdoor ambient acoustical environments may be predicted through supervised machine learning using geospatial features as inputs. However, collecting sufficient training data is expensive, particularly when attempting to improve the accuracy of supervised learning models over large, geospatially diverse regions. Unsupervised machine learning methods, such as K-Means clustering analysis, enable a statistical comparison between the geospatial diversity represented in the current training dataset and that of the predictor locations. In this case, the geospatial features that represent the regions of western North Carolina and Utah have been clustered simultaneously to examine the clusters common to the two locations. Initial results show that most geospatial clusters group themselves according to a relatively small number of prominent geospatial features, and that Utah requires appreciably more clusters to represent its geospace. Additionally, the training dataset has relatively low geospatial diversity because most of the current training data sites reside in a small number of clusters. This analysis informs a choice of new site locations for data acquisition that maximizes the statistical similarity of the training and input datasets.
- This paper introduces a new GeoAI solution to support automated mapping of global craters on the Mars surface. Traditional crater detection algorithms are limited to working in a semiautomated or multi-stage manner, and most were developed to handle a specific dataset in a small subarea of Mars’ surface, hindering their transferability to global crater detection. As an alternative, we propose a GeoAI solution based on deep learning to tackle this problem effectively. Three innovative features are integrated into our object detection pipeline: (1) a feature pyramid network is leveraged to generate feature maps with rich semantics across multiple object scales; (2) prior geospatial knowledge based on the Hough transform is integrated to enable more accurate localization of potential craters; and (3) a scale-aware classifier is adopted to increase the prediction accuracy of both large and small crater instances. The results show that the proposed strategies significantly improve crater detection performance over the popular Faster R-CNN model. The integration of geospatial domain knowledge into data-driven analytics moves GeoAI research to the next level by enabling knowledge-driven GeoAI. This research can be applied to a wide variety of object detection and image analysis tasks.
- Objective: Rotors, regions of spiral wave reentry in cardiac tissues, are considered the drivers of atrial fibrillation (AF), the most common arrhythmia. Whereas physics-based approaches have been widely deployed to detect rotors, they typically require in-depth knowledge of cardiac physiology and electrogram interpretation skills. The recent leap forward in smart sensing, data acquisition, and Artificial Intelligence (AI) has offered an unprecedented opportunity to transform the diagnosis and treatment of cardiac ailments, including AF. This study aims to develop an image-decomposition-enhanced deep learning framework for automatic identification of rotor cores on both simulation and optical mapping data. Methods: We adopt the Ensemble Empirical Mode Decomposition (EEMD) algorithm to decompose the original image, and the most representative component is then fed into a You-Only-Look-Once (YOLO) object-detection architecture for rotor detection. Simulation data from a bi-domain simulation model and optical mapping data acquired from isolated rabbit hearts are used for training and validation. Results: The integrated EEMD-YOLO model achieves high accuracy on both simulation and optical mapping data (precision: 97.2%, 96.8%; recall: 93.8%, 92.2%; and F1 score: 95.5%, 94.4%, respectively). Conclusion: The proposed EEMD-YOLO yields rotor detection accuracy comparable to the gold standard in the literature.
- Deep learning has become the most popular direction in machine learning and artificial intelligence. However, preparing training data and training models are often time-consuming and become the bottleneck of the end-to-end machine learning lifecycle. Reusing existing models for inference on a new dataset can avoid the costs of retraining. However, when there are multiple candidate models, it is challenging to discover the right model for reuse. Although a number of model-sharing platforms exist, such as ModelDB, TensorFlow Hub, PyTorch Hub, and DLHub, most of these systems require model uploaders to manually specify the details of each model and model downloaders to screen keyword search results to select a model. What is lacking is a highly productive model search tool that selects models for deployment without any manual inspection or labeled data from the target domain. This paper proposes multiple model search strategies, including various similarity-based and non-similarity-based approaches. We design, implement, and evaluate these approaches on multiple model inference scenarios, including activity recognition, image recognition, text classification, natural language processing, and entity matching. The experimental evaluation showed that our proposed asymmetric similarity-based measurement, adaptivity, outperformed symmetric similarity-based measurements and non-similarity-based measurements in most of the workloads.
- This paper assesses trending AI foundation models, especially emerging computer vision foundation models, and their performance in natural landscape feature segmentation. While the term foundation model has quickly garnered interest from the geospatial domain, its definition remains vague. Hence, this paper first introduces AI foundation models and their defining characteristics. Building on the tremendous success achieved by Large Language Models (LLMs) as the foundation models for language tasks, this paper discusses the challenges of building foundation models for geospatial artificial intelligence (GeoAI) vision tasks. To evaluate the performance of large AI vision models, especially Meta’s Segment Anything Model (SAM), we implemented different instance segmentation pipelines that minimize the changes to SAM to leverage its power as a foundation model. A series of prompt strategies were developed to test SAM’s performance regarding its theoretical upper bound of predictive accuracy, zero-shot performance, and domain adaptability through fine-tuning. The analysis used two permafrost feature datasets, ice-wedge polygons and retrogressive thaw slumps, because (1) these landform features are more challenging to segment than man-made features due to their complicated formation mechanisms, diverse forms, and vague boundaries; and (2) their presence and changes are important indicators of Arctic warming and climate change. The results show that, although promising, SAM still has room for improvement to support AI-augmented terrain mapping. The spatial and domain generalizability of this finding is further validated using a more general dataset, EuroCrops, for agricultural field mapping. Finally, we discuss future research directions that strengthen SAM’s applicability in challenging geospatial domains.
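As a quick consistency check on the rotor-detection figures quoted in the list above, the F1 score can be recomputed from the reported precision and recall as their harmonic mean. This is a minimal sketch using only the numbers stated in that abstract. The helper name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as quoted in the EEMD-YOLO abstract.
reported = {
    "simulation": (0.972, 0.938, 0.955),
    "optical mapping": (0.968, 0.922, 0.944),
}

for name, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{name}: computed F1 = {f1:.3f}, reported F1 = {f1_reported:.3f}")
```

Rounded to three decimals, both recomputed values agree with the reported F1 scores.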