

Search for: All records

Award ID contains: 2102126


  1. Abstract: Classifying images using supervised machine learning (ML) relies on labeled training data, such as classes or text descriptions associated with each image. Data-driven models are only as good as the data used for training, which points to the importance of high-quality labeled data for developing an ML model with predictive skill. Labeling data is typically a time-consuming, manual process. Here, we investigate the process of labeling data, with a specific focus on coastal aerial imagery captured in the wake of hurricanes that affected the Atlantic and Gulf Coasts of the United States. The imagery data set is a rich observational record of storm impacts and coastal change, but the imagery requires labeling to render that information accessible. We created an online interface that served labelers a stream of images and a fixed set of questions. A total of 1,600 images were labeled by at least two and as many as seven coastal scientists. We used the resulting data set to investigate interrater agreement: the extent to which labelers labeled each image similarly. Interrater agreement scores, assessed with percent agreement and Krippendorff's alpha, are higher when the questions posed to labelers are relatively simple, when the labelers are provided with a user manual, and when images are smaller. Experiments in interrater agreement point toward the benefit of multiple labelers for understanding the uncertainty in labeling data for machine learning research.
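The two agreement measures named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code; the image names and classes ("dune", "beach") are hypothetical, and the alpha implementation is the standard coincidence-matrix form for nominal labels with possibly missing raters.

```python
from collections import Counter, defaultdict

def percent_agreement(units):
    """Fraction of multiply-labeled units on which every labeler agreed."""
    rated = [labels for labels in units.values() if len(labels) >= 2]
    return sum(1 for labels in rated if len(set(labels)) == 1) / len(rated)

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: dict mapping unit id -> list of labels from the raters who saw it.
    """
    o = defaultdict(float)              # coincidence matrix
    for labels in units.values():
        m = len(labels)
        if m < 2:
            continue                    # single-label units carry no pairing info
        for i, a in enumerate(labels):
            for j, b in enumerate(labels):
                if i != j:
                    o[(a, b)] += 1.0 / (m - 1)
    n = sum(o.values())                 # total pairable values
    n_c = Counter()                     # marginal totals per class
    for (a, _), v in o.items():
        n_c[a] += v
    disagree_obs = sum(v for (a, b), v in o.items() if a != b)
    disagree_exp = sum(n_c[a] * n_c[b] for a in n_c for b in n_c if a != b)
    return 1.0 - (n - 1) * disagree_obs / disagree_exp

# Hypothetical labels: two raters per image, one disagreement on img3.
labels = {
    "img1": ["dune", "dune"],
    "img2": ["beach", "beach"],
    "img3": ["dune", "beach"],
    "img4": ["beach", "beach"],
}
pa = percent_agreement(labels)            # 0.75
alpha = krippendorff_alpha_nominal(labels)  # ~0.533
```

Note that alpha is lower than percent agreement here because alpha discounts the agreement expected by chance given the class marginals.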
  2. The growing push for open data has resulted in an abundance of data for coastal researchers, which can make relevant data difficult for individual researchers to discover. One solution is to explicitly develop services for coastal researchers that curate data for discovery, host discussions around reuse, build community, and help find collaborators. To develop the idea of a coastal data curation service, we investigate aspects of the UNESCO International Coastal Atlas Network member sites that could be used to build such a service. We develop a minimal example of a coastal data curation service, deploy it as a website, and describe the next steps to move beyond the prototype phase. We envision a coastal data curation service as a way to cultivate a community focused on coastal data discovery and reuse.
  3. We present an active learning pipeline to identify hurricane impacts on coastal landscapes. Previously unlabeled post-storm images are used in a three-component workflow: first, an online interface is used to crowd-source labels for the imagery; second, a convolutional neural network is trained on the labeled images; third, model predictions are displayed on an interactive map. Both the labeling interface and the interactive map allow coastal scientists to provide additional labels, which will be used to develop a large labeled dataset, a refined model, and improved hurricane impact assessments.
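The label-train-predict-relabel loop described above can be sketched with a deliberately tiny stand-in model. Everything here is an illustrative assumption, not the authors' system: each "image" is reduced to a single feature in [0, 1], a one-dimensional threshold classifier stands in for the convolutional neural network, and an `oracle` function stands in for the human labeling interface. The loop queries the images the current model is least confident about, which is the core of uncertainty-sampling active learning.

```python
import random

random.seed(0)

# Hypothetical ground truth: the image shows "washover" when its feature > 0.5.
def oracle(x):
    # Stands in for the human labeling interface.
    return "washover" if x > 0.5 else "no-impact"

def train(labeled):
    # Stand-in for CNN training: fit a 1-D threshold between the two classes.
    wash = [x for x, y in labeled if y == "washover"]
    none = [x for x, y in labeled if y == "no-impact"]
    return (min(wash) + max(none)) / 2

pool = [random.random() for _ in range(200)]        # unlabeled post-storm images
labeled = [(0.1, oracle(0.1)), (0.9, oracle(0.9))]  # seed labels, one per class

for _ in range(3):
    thr = train(labeled)                            # train on current labels
    # Surface the least-confident images (closest to the decision boundary)
    # and send them back to the labelers.
    pool.sort(key=lambda x: abs(x - thr))
    queries, pool = pool[:10], pool[10:]
    labeled += [(x, oracle(x)) for x in queries]
```

Because each round queries the images nearest the decision boundary, the labeled set concentrates where the model is most uncertain, which is the intended payoff of the pipeline: fewer human labels spent on easy, redundant images.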