Title: Human Supervision is Key to Achieving Accurate AI-assisted Wildlife Identifications in Camera Trap Images
Using public support to extract information from vast datasets has become a popular method for accurately labeling wildlife data in camera trap (CT) images. However, the increasing demand for volunteer effort lengthens the time interval between data collection and our ability to draw ecological inferences or perform data-driven conservation actions. Artificial intelligence (AI) approaches are currently highly effective for species detection (i.e., whether an image contains animals or not) and for labeling common species; however, they perform poorly on species rarely captured in images and on species that are highly visually similar to one another. To capitalize on the best of human and AI classification methods, we developed an integrated CT data pipeline in which AI provides an initial pass on labeling images but is supervised and validated by humans (i.e., a “human-in-the-loop,” or HITL, approach). To assess classification accuracy gains, we compare the precision of species labels produced by the AI and HITL protocols to a “gold standard” (GS) dataset annotated by wildlife experts. The accuracy of the AI method was species-dependent and positively correlated with the number of training images. The combined efforts of HITL led to error rates of less than 10% for 73% of the dataset and lowered the error rates for an additional 23%. For two visually similar species, human input resulted in higher error rates than AI. While integrating humans in the loop increases classification times relative to AI alone, the gains in accuracy suggest that this method is highly valuable for high-volume CT surveys.
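Below is a minimal Python sketch of the HITL labeling flow described in the abstract: the AI makes a first pass, lower-confidence images are routed to a human reviewer, and the resulting labels are scored against the expert gold standard. The function names, the confidence threshold, and the routing rule are illustrative assumptions, not the authors' implementation.

CONFIDENCE_THRESHOLD = 0.90  # assumed cutoff below which a human reviews the AI label

def hitl_label(images, ai_classifier, request_human_label):
    """Label each image with an AI first pass, routing uncertain cases to a human."""
    labels = {}
    for img in images:
        species, confidence = ai_classifier(img)                     # AI first pass
        if confidence < CONFIDENCE_THRESHOLD:
            species = request_human_label(img, suggestion=species)   # human supervision
        labels[img] = species
    return labels

def error_rate(labels, gold_standard):
    """Fraction of labels that disagree with the expert gold-standard dataset."""
    mismatches = sum(1 for img, sp in labels.items() if gold_standard[img] != sp)
    return mismatches / len(labels)

The same error_rate function applied to AI-only labels and to HITL labels would reproduce the kind of AI-versus-HITL comparison the study reports.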
Award ID(s):
2006894
PAR ID:
10569043
Author(s) / Creator(s):
Editor(s):
Fortson, Lucy; Crowston, Kevin; Kloetzer, Laure; Ponti, Marisa
Publisher / Repository:
Ubiquity Press
Date Published:
Journal Name:
Citizen Science: Theory and Practice
Volume:
9
Issue:
1
ISSN:
2057-4991
Subject(s) / Keyword(s):
Citizen Science conservation ecology
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Camera traps - remote cameras that capture images of passing wildlife - have become a ubiquitous tool in ecology and conservation. Systematic camera trap surveys generate ‘Big Data’ across broad spatial and temporal scales, providing valuable information on environmental and anthropogenic factors affecting vulnerable wildlife populations. However, the sheer number of images amassed can quickly outpace researchers’ ability to manually extract data from these images (e.g., species identities, counts, and behaviors) in timeframes useful for making scientifically guided conservation and management decisions. Here, we present ‘Snapshot Safari’ as a case study for merging citizen science and machine learning to rapidly generate highly accurate ecological Big Data from camera trap surveys. Snapshot Safari is a collaborative cross-continental research and conservation effort with 1500+ cameras deployed at over 40 protected areas in eastern and southern Africa, generating millions of images per year. As one of the first and largest-scale camera trapping initiatives, Snapshot Safari spearheaded innovative developments in citizen science and machine learning. We highlight the advances made and discuss the issues that arose using each of these methods to annotate camera trap data. We end by describing how we combined human and machine classification methods (‘Crowd AI’) to create an efficient integrated data pipeline. Ultimately, by using a feedback loop in which humans validate machine learning predictions and machine learning algorithms are iteratively retrained on new human classifications, we can capitalize on the strengths of both methods of classification while mitigating the weaknesses. Using Crowd AI to quickly and accurately ‘unlock’ ecological Big Data for use in science and conservation is revolutionizing the way we take on critical environmental issues in the Anthropocene era.
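A minimal Python sketch of the ‘Crowd AI’ feedback loop described above: the model proposes labels, volunteers validate them, and the model is iteratively retrained on the accumulated human classifications. All object and function names are hypothetical placeholders, not the Snapshot Safari codebase.

def crowd_ai_loop(model, unlabeled_batches, crowd_validate):
    """Iteratively label, validate, and retrain on batches of camera trap images."""
    training_set = []
    for batch in unlabeled_batches:
        predictions = [(img, model.predict(img)) for img in batch]   # machine pass
        confirmed = crowd_validate(predictions)                      # volunteers validate
        training_set.extend(confirmed)                               # keep human labels
        model.retrain(training_set)                                  # iterative retraining
    return model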
  2.
    With increasing automation, the ‘human’ element in industrial systems is gradually being reduced, often for the sake of standardization. Complete automation, however, might not be optimal in complex, uncertain environments due to the dynamic and unstructured nature of interactions. Leveraging human perception and cognition can prove fruitful in making automated systems robust and sustainable. “Human-in-the-loop” (HITL) systems are systems that incorporate meaningful human interactions into the workflow. Agricultural Robotic Systems (ARS), developed for the timely detection and prevention of diseases in agricultural crops, are an example of cyber-physical systems where HITL augmentation can provide improved detection capabilities and system performance. Humans can apply their domain knowledge and diagnostic skills to fill in the knowledge gaps present in agricultural robotics and make them more resilient to variability. Owing to the multi-agent nature of ARS, HUB-CI, a collaborative platform for the optimization of interactions between agents, is emulated to direct workflow logic. The challenge remains in designing and integrating human roles and tasks in the automated loop. This article explains the development of a HITL simulation for ARS, first by realistically modeling human agents and then by exploring two different modes by which they can be integrated into the loop: Sequential and Shared Integration. System performance metrics such as costs, number of tasks, and classification accuracy are measured and compared for different collaboration protocols. The results show statistically significant advantages of the HUB-CI protocols over the traditional protocols for each integration mode, and the article also discusses the competitive factors of both modes. Strengthening human modeling and expanding the range of human activities within the loop can help improve the practicality and accuracy of the simulation in replicating a HITL-ARS.
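The two integration modes can be illustrated with a short Python sketch, under assumed semantics: in Sequential Integration the human reviews every robot classification in series, while in Shared Integration tasks are split between robot and human agents. The task structure, agent functions, and the 30% human share are hypothetical, not the article's actual simulation.

def sequential_integration(tasks, robot_classify, human_review):
    # The robot classifies every task first; the human then reviews each result in series.
    return [human_review(task, robot_classify(task)) for task in tasks]

def shared_integration(tasks, robot_classify, human_classify, human_share=0.3):
    # A fixed fraction of tasks is routed directly to the human; the rest go to the robot.
    split = int(len(tasks) * human_share)
    return [human_classify(t) for t in tasks[:split]] + \
           [robot_classify(t) for t in tasks[split:]]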
  3. AI has revolutionized the processing of various services, including the automatic facial verification of people. Automated approaches have demonstrated their speed and efficiency in verifying a large volume of faces, but they can face challenges when processing content from certain communities, including communities of people of color. This challenge has prompted the adoption of "human-in-the-loop" (HITL) approaches, where human workers collaborate with the AI to minimize errors. However, most HITL approaches do not consider workers’ individual characteristics and backgrounds. This paper proposes a new approach, called Inclusive Portraits (IP), that connects with social theories around race to design a racially-aware human-in-the-loop system. Our experiments have provided evidence that incorporating race into HITL systems for facial verification can significantly enhance performance, especially for services delivered to people of color. Our findings also highlight the importance of considering individual worker characteristics in the design of HITL systems, rather than treating workers as a homogeneous group. Our research has significant design implications for developing AI-enhanced services that are more inclusive and equitable.
  4. The world’s coastlines are spatially highly variable, coupled human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequent observation from imaging sensors on remote sensing platforms. Machine Learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe “Coast Train,” a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. These data include imagery that are diverse in space and time, and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool especially designed for rapid and reproducible Earth surface image segmentation. Our approach permits image labeling by multiple labelers, in turn enabling quantification of pixel-level agreement over individual and collections of images.
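A minimal Python sketch of the pixel-level agreement quantification mentioned above: the proportion of pixels on which all labelers assign the same class. The array shapes and integer class encoding are assumptions for illustration, not the Coast Train tooling.

import numpy as np

def pixel_agreement(label_maps):
    """label_maps: list of (H, W) integer class arrays, one per labeler."""
    stacked = np.stack(label_maps)                       # (n_labelers, H, W)
    unanimous = np.all(stacked == stacked[0], axis=0)    # pixels where everyone agrees
    return unanimous.mean()                              # proportion of agreeing pixels

# Example: three hypothetical labelers on a 2x2 image
maps = [np.array([[1, 2], [0, 1]]), np.array([[1, 2], [0, 1]]), np.array([[1, 3], [0, 1]])]
print(pixel_agreement(maps))  # 0.75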
  5. The development of artificial intelligence (AI) provides an opportunity for rapid and accurate assessment of earthquake-induced infrastructure damage using social media images. Nevertheless, data collection and labeling remain challenging due to limited expertise among annotators. This study introduces a novel four-class Earthquake Infrastructure Damage (EID) assessment data set compiled from a combination of images from several other social media image databases but with added emphasis on data quality. Unlike previous data sets such as the Damage Assessment Dataset (DAD) and Crisis Benchmark, the EID includes comprehensive labeling guidelines and a multiclass classification system aligned with established damage scales, such as HAZUS and EMS-98, to enhance the accuracy and utility of social media imagery for disaster response. By integrating detailed descriptions and clear labeling criteria, the labeling approach of EID reduces the subjective nature of image labeling and the inconsistencies found in existing data sets. The findings demonstrate a significant improvement in annotator agreement, reducing disagreement from 39.7% to 10.4%, thereby validating the efficacy of the refined labeling strategy. The EID, containing 13,513 high-quality images from five significant earthquakes, is designed to support community-level assessments and advanced computational research, paving the way for enhanced disaster response strategies through improved data utilization and analysis. The data set is available at DesignSafe: https://doi.org/10.17603/ds2-yj8p-hs62.
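A short Python sketch of how a per-image annotator disagreement rate, such as the 39.7% and 10.4% figures reported above, might be computed: the share of images whose annotators do not all choose the same damage class. The data layout and example values are assumptions, not the EID release format.

def disagreement_rate(annotations):
    """annotations: dict mapping image_id -> list of class labels from different annotators."""
    disagreements = sum(1 for labels in annotations.values() if len(set(labels)) > 1)
    return disagreements / len(annotations)

# Example with three hypothetical images and a four-class damage scheme (0-3)
example = {"img_a": [2, 2, 2], "img_b": [0, 1, 0], "img_c": [3, 3, 3]}
print(disagreement_rate(example))  # one of three images has disagreement -> ~0.33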