Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets

Brenskelle, Laura (ORCID: 0000-0002-9284-8871); Guralnick, Rob P. (ORCID: 0000-0001-6682-1504); Denslow, Michael (ORCID: 0000-0002-5431-3542); Stucky, Brian J.

doi:10.1002/aps3.11370

Premise

Digitization and imaging of herbarium specimens provide essential historical phenotypic and phenological information about plants. However, the full use of these resources requires high‐quality human annotations. Here we provide guidance on the design and implementation of image annotation projects for botanical research.

Methods and Results

We used a novel gold‐standard data set to test the accuracy of human phenological annotations of herbarium specimen images in two settings: structured, in‐person sessions and an online, community‐science platform. We examined how different factors influenced annotation accuracy and found that botanical expertise, academic career level, and time spent on annotations had little effect on accuracy. Rather, key factors included traits and taxa being scored, the annotation setting, and the individual scorer. In‐person annotations were significantly more accurate than online annotations, but both generated relatively high‐quality outputs. Gathering multiple, independent annotations for each image improved overall accuracy.
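As an illustration of why multiple independent annotations can help, the following is a minimal sketch that combines several annotations per image and scores the result against a gold standard. It assumes a simple majority vote as the combination rule; the abstract does not state which aggregation method was used. The image identifiers, label names, and function names are hypothetical and carry no data from the study.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label, or None if there is no label or a tie."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie for first place: no consensus
    return counts[0][0]


def accuracy(predictions, gold):
    """Fraction of gold-standard images whose predicted label matches."""
    correct = sum(
        1 for image_id, truth in gold.items()
        if predictions.get(image_id) == truth
    )
    return correct / len(gold)


# Hypothetical gold-standard labels (assumption, for illustration only).
gold = {
    "img1": "flowers_present",
    "img2": "flowers_absent",
    "img3": "flowers_present",
}

# Hypothetical independent annotations, three per image.
annotations = {
    "img1": ["flowers_present", "flowers_present", "flowers_absent"],
    "img2": ["flowers_absent", "flowers_present", "flowers_absent"],
    "img3": ["flowers_absent", "flowers_present", "flowers_present"],
}

# Compare using only the first annotation with the majority-vote consensus.
single = {image_id: labels[0] for image_id, labels in annotations.items()}
consensus = {
    image_id: majority_vote(labels) for image_id, labels in annotations.items()
}

print("single-annotation accuracy:", accuracy(single, gold))
print("consensus accuracy:", accuracy(consensus, gold))
```

In this toy data each annotator makes a different mistake, so the consensus recovers every gold label while any single annotation does not. Real annotation errors may be correlated across scorers, in which case voting helps less.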

Conclusions

Our results provide a best‐practices basis for using human effort to annotate images of plants. We show that scalable community science mechanisms can produce high‐quality data, but care must be taken to choose tractable taxa and phenophases and to provide informative training material.
