NSF PAR Search | NSF Public Access Repository

Labeling Poststorm Coastal Imagery for Machine Learning: Measurement of Interrater Agreement

https://doi.org/10.1029/2021EA001896

Goldstein, Evan B.; Buscombe, Daniel; Lazarus, Eli D.; Mohanty, Somya D.; Rafique, Shah Nafis; Anarde, Katherine A.; Ashton, Andrew D.; Beuzen, Tomas; Castagno, Katherine A.; Cohn, Nicholas; et al. (September 2021, Earth and Space Science)

Abstract: Classifying images using supervised machine learning (ML) relies on labeled training data (for example, classes or text descriptions associated with each image). Data‐driven models are only as good as the data used for training, and this points to the importance of high‐quality labeled data for developing an ML model that has predictive skill. Labeling data is typically a time‐consuming, manual process. Here, we investigate the process of labeling data, with a specific focus on coastal aerial imagery captured in the wake of hurricanes that affected the Atlantic and Gulf Coasts of the United States. The imagery data set is a rich observational record of storm impacts and coastal change, but the imagery requires labeling to render that information accessible. We created an online interface that served labelers a stream of images and a fixed set of questions. A total of 1,600 images were labeled by at least two or as many as seven coastal scientists. We used the resulting data set to investigate interrater agreement: the extent to which labelers labeled each image similarly. Interrater agreement scores, assessed with percent agreement and Krippendorff's alpha, are higher when the questions posed to labelers are relatively simple, when the labelers are provided with a user manual, and when images are smaller. Experiments in interrater agreement point toward the benefit of multiple labelers for understanding the uncertainty in labeling data for machine learning research.
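The two agreement measures named in the abstract can be illustrated with a minimal sketch. It assumes nominal (categorical) answers and represents each image as a list of the labels it received, so images may have different numbers of labelers. The function names and the data layout are hypothetical, and this is not the authors' code; it follows the standard coincidence-matrix formulation of Krippendorff's alpha for nominal data.

```python
from collections import Counter
from itertools import combinations, permutations


def percent_agreement(units):
    """Mean, over images, of the fraction of labeler pairs that agree."""
    scores = []
    for labels in units:
        pairs = list(combinations(labels, 2))
        if not pairs:  # an image with fewer than two labels carries no pair
            continue
        scores.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(scores) / len(scores)


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels, allowing unequal labeler counts."""
    # Coincidence matrix: each ordered pair of labels on the same image
    # contributes 1 / (m - 1), where m is the number of labels on that image.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per category and overall number of pairable labels.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (a common factor 1/n cancels).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:  # only one category was ever used
        return 1.0
    return 1 - observed / expected
```

Percent agreement ignores agreement expected by chance, while alpha corrects for it, which is why the two scores can differ noticeably on the same labels.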
A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments

https://doi.org/10.1038/s41597-023-01929-2

Buscombe, Daniel; Wernette, Phillipe; Fitzpatrick, Sharon; Favela, Jaycee; Goldstein, Evan B.; Enwright, Nicholas M. (December 2023, Scientific Data)

Abstract: The world’s coastlines are spatially highly variable, coupled human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequent observation from imaging sensors on remote sensing platforms. Machine learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe “Coast Train,” a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. These data include imagery that is diverse in space and time, and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool especially designed for rapid and reproducible Earth surface image segmentation. Our approach permits image labeling by multiple labelers, in turn enabling quantification of pixel-level agreement over individual images and collections of images.
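One simple way to quantify pixel-level agreement between labelers is sketched below. It assumes each labeler's segmentation of an image is a 2D grid (a list of rows) of integer class codes with identical shape. The abstract does not state which agreement metric the dataset uses, so the mean pairwise fraction of matching pixels here is an illustrative assumption, and the function names are hypothetical.

```python
from itertools import combinations


def pixel_agreement(mask_a, mask_b):
    """Fraction of pixels assigned the same class by two labelers."""
    same_shape = len(mask_a) == len(mask_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    total = sum(len(row) for row in mask_a)
    matching = sum(
        a == b
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
    )
    return matching / total


def mean_pairwise_agreement(masks):
    """Average pixel agreement over every pair of labelers for one image."""
    pairs = list(combinations(masks, 2))
    return sum(pixel_agreement(a, b) for a, b in pairs) / len(pairs)
```

Averaging `mean_pairwise_agreement` over many images would give a collection-level score; a chance-corrected measure may be preferable when one class dominates the imagery.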
Full Text Available
A Reproducible and Reusable Pipeline for Segmentation of Geoscientific Imagery

https://doi.org/10.1029/2022EA002332

Buscombe, D.; Goldstein, E. B. (September 2022, Earth and Space Science)

Full Text Available
Human‐in‐the‐Loop Segmentation of Earth Surface Imagery

https://doi.org/10.1029/2021EA002085

Buscombe, D.; Goldstein, E. B.; Sherwood, C. R.; Bodine, C.; Brown, J. A.; Favela, J.; Fitzpatrick, S.; Kranenburg, C. J.; Over, J. R.; Ritchie, A. C.; et al. (March 2022, Earth and Space Science)

Full Text Available
An Active Learning Pipeline to Detect Hurricane Washover in Post-Storm Aerial Images

https://doi.org/10.31223/X5JW23

Goldstein EB, Mohanty SD (December 2020, AI for Earth Sciences Workshop at NeurIPS 2020)

We present an active learning pipeline to identify hurricane impacts on coastal landscapes. Previously unlabeled post-storm images are used in a three-component workflow: first, an online interface is used to crowd-source labels for imagery; second, a convolutional neural network is trained using the labeled images; third, model predictions are displayed on an interactive map. Both the labeler and interactive map allow coastal scientists to provide additional labels that will be used to develop a large labeled dataset, a refined model, and improved hurricane impact assessments.
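A common step in active learning is choosing which unlabeled images to send to labelers next. The sketch below uses entropy-based uncertainty sampling, which is an assumption made for illustration: the abstract does not say how this pipeline prioritizes images. The function names and the dictionary of per-image class probabilities are hypothetical stand-ins for the output of the trained network.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, k):
    """Return the k image IDs whose predictions are most uncertain.

    predictions maps an image ID to a list of class probabilities
    (hypothetical model output).
    """
    ranked = sorted(
        predictions,
        key=lambda image_id: entropy(predictions[image_id]),
        reverse=True,
    )
    return ranked[:k]
```

Images selected this way would be shown in the labeling interface, and the new labels added to the training set before the model is retrained.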
Full Text Available
psi-collect: A Python module for post-storm image collection and cataloging

https://doi.org/10.21105/joss.02075

Moretz, Matthew; Foster, Daniel; Weber, John; Chowdhury, Rinty; Rafique, Shah; Goldstein, Evan; Mohanty, Somya (March 2020, Journal of Open Source Software)

Full Text Available
