Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species

Leipzig, J.; Bakis, Y.; Wang, X.; Elhamod, M.; Diamond, K.; Dahdul, W.; Karpante, A.; Maga, M.; Mabee, P.; Bart, H.; Greenberg, J.

Biodiversity image repositories are crucial sources of training data for machine learning approaches to biological research. Metadata, specifically metadata about object quality, is putatively an important prerequisite for selecting sample subsets for these experiments. This study demonstrates the importance of image quality metadata in a species classification experiment involving a corpus of 1935 fish specimen images annotated with 22 metadata quality properties. When used by a convolutional neural network approach to species identification, a small subset of high-quality images produced an F1 score of 0.41, compared to 0.35 for a taxonomically matched subset of low-quality images. Analysis of the full corpus revealed that image quality differed between correctly classified and misclassified images. We found that the visibility of all anatomical features was the most important quality feature for classification accuracy. We suggest that biodiversity image repositories consider adopting a minimal set of image quality metadata to support future machine learning projects.
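The comparison described in the abstract, selecting image subsets by a quality annotation and scoring each with F1, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record layout, the `quality` field, the `select_by_quality` helper, and the choice of macro averaging are hypothetical and are not taken from the paper, which does not state in this record how F1 was averaged.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def select_by_quality(records, min_quality):
    """Keep records whose hypothetical 'quality' annotation meets a threshold."""
    return [r for r in records if r["quality"] >= min_quality]


# Hypothetical records: each has a quality annotation, a true species,
# and a classifier prediction. The values are made up for illustration.
records = [
    {"quality": 9, "species": "a", "predicted": "a"},
    {"quality": 8, "species": "b", "predicted": "b"},
    {"quality": 3, "species": "a", "predicted": "b"},
    {"quality": 2, "species": "b", "predicted": "b"},
]

high = select_by_quality(records, min_quality=8)
score_high = macro_f1([r["species"] for r in high],
                      [r["predicted"] for r in high])
score_all = macro_f1([r["species"] for r in records],
                     [r["predicted"] for r in records])
```

In a real experiment the subsets would be used to train and evaluate separate models rather than to filter one set of predictions; the sketch only shows how a quality field can drive subset selection and how a per-class F1 average is computed.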

Award ID(s): 1940233

PAR ID: 10298055

Author(s) / Creator(s): Leipzig, J.; Bakis, Y.; Wang, X.; Elhamod, M.; Diamond, K.; Dahdul, W.; Karpante, A.; Maga, M.; Mabee, P.; Bart, H.; Greenberg, J.

Date Published: 2021-03-18

Journal Name: Metadata and Semantic Research. MTSR 2020. Communications in Computer and Information Science

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0 (Conference Paper)
The DOI is not currently available.
