Flukebook: an open-source AI platform for cetacean photo identification

Blount, Drew (ORCID:0000000230872702); Gero, Shane (ORCID:000000016854044X); Van Oast, Jon; Parham, Jason (ORCID:0000000188814735); Kingen, Colin; Scheiner, Ben; Stere, Tanya; Fisher, Mark (ORCID:0000000271387025); Minton, Gianna (ORCID:0000000342842540); Khan, Christin (ORCID:0000000196085632); Dulau, Violaine (ORCID:0000000222283959); Thompson, Jaime (ORCID:0000000300028292); Moskvyak, Olga (ORCID:0000000188218319); Berger-Wolf, Tanya (ORCID:0000000176101412); Stewart, Charles V. (ORCID:0000000165326675); Holmberg, Jason (ORCID:0000000271296705); Levenson, J. Jacob (ORCID:0000000271692775)

doi:10.1007/s42991-021-00221-3

Abstract

Determining which species are at greatest risk, where they are most vulnerable, and what are the trajectories of their communities and populations is critical for conservation and management. Globally distributed, wide-ranging whales and dolphins present a particular challenge in data collection because no single research team can record data over biologically meaningful areas. Flukebook.org is an open-source web platform that addresses these gaps by providing researchers with the latest computational tools. It integrates photo-identification algorithms with data management, sharing, and privacy infrastructure for whale and dolphin research, enabling the global collaborative study of these global species. With seven automatic identification algorithms trained for 15 different species, resulting in 37 species-specific identification pipelines, Flukebook is an extensible foundation that continually incorporates emerging AI techniques and applies them to cetacean photo identification through continued collaboration between computer vision researchers, software engineers, and biologists. With over 2.0 million photos of over 52,000 identified individual animals submitted by over 250 researchers, the platform enables a comprehensive understanding of cetacean populations, fostering international and cross-institutional collaboration while respecting data ownership and privacy. We outline the technology stack and architecture of Flukebook, its performance on real-world cetacean imagery, and its development as an example of scalable, extensible, and reusable open-source conservation software. Flukebook is a step change in our ability to conduct large-scale research on cetaceans across biologically meaningful geographic ranges, to rapidly iterate population assessments and abundance trajectories, and to engage the public in actions to protect them.
