Abstract
Drones are increasingly popular for collecting behaviour data of group-living animals, offering inexpensive and minimally disruptive observation methods. Imagery collected by drones can be rapidly analysed with computer vision techniques to extract information, including behaviour classification, habitat analysis and identification of individual animals. The success of these analyses, however, often depends on careful mission planning that considers downstream computational requirements, a critical factor frequently overlooked in current studies.

We present a comprehensive summary of research in the growing AI-driven animal ecology (ADAE) field, which integrates data collection with automated computational analysis, focused on aerial imagery for collective animal behaviour studies. We systematically analyse current methodologies, technical challenges and emerging solutions in this field, from drone mission planning to behavioural inference. We illustrate computer vision pipelines that infer behaviour from drone imagery and present the computer vision tasks used for each step. We map specific computational tasks to their ecological applications, providing a framework for future research design.

Our analysis reveals that ADAE studies of collective animal behaviour using drone imagery focus on the detection and classification computer vision tasks. While convolutional neural networks (CNNs) remain dominant for detection and classification, newer architectures such as transformer-based models and specialised video analysis networks designed for temporal pattern recognition (e.g. X3D, I3D, SlowFast) are gaining traction for pose estimation and behaviour inference. However, reported model accuracy varies widely by computer vision task, species, habitat and evaluation metric, complicating meaningful comparisons between studies.

Based on current trends, we conclude that semi-autonomous drone missions will be increasingly used to study collective animal behaviour. While manual drone operation remains prevalent, autonomous drone manoeuvres, powered by edge AI, can scale and standardise collective animal behaviour studies while reducing the risk of disturbance and improving data quality. We propose guidelines for ADAE drone studies adaptable to various computer vision tasks, species and habitats. This approach aims to collect high-quality behaviour data while minimising disruption to the ecosystem.
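To make the pipeline concrete, here is a minimal sketch of the detection step such a pipeline typically begins with: running a pretrained object detector over drone video frames and keeping high-confidence boxes as candidate animals for downstream tracking and behaviour classification. The torchvision model, COCO weights and the 0.8 score threshold are illustrative assumptions, not the configuration of any study reviewed above; in practice, detectors are fine-tuned on wildlife imagery.

```python
# Sketch of the detection stage of an ADAE-style pipeline: pretrained
# detector over drone frames -> high-confidence candidate boxes.
# Model choice and threshold are illustrative assumptions.
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT  # COCO weights; fine-tune on wildlife data in practice
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
preprocess = weights.transforms()

def detect_animals(frame, score_threshold=0.8):
    """frame: HxWx3 uint8 RGB array -> list of (box, score) candidates."""
    x = preprocess(torch.from_numpy(frame).permute(2, 0, 1))  # HWC -> CHW
    with torch.no_grad():
        out = model([x])[0]
    keep = out["scores"] >= score_threshold
    return list(zip(out["boxes"][keep].tolist(), out["scores"][keep].tolist()))
```

The boxes returned per frame would then feed a tracker and a temporal model (e.g. the X3D/SlowFast family mentioned above) for behaviour inference.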
Perspectives on Individual Animal Identification from Biology and Computer Vision
Synopsis
Identifying individual animals is crucial for many biological investigations. In response to some of the limitations of current identification methods, new automated computer vision approaches have emerged with strong performance. Here, we review current advances in computer vision identification techniques to provide both computer scientists and biologists with an overview of the available tools, and we discuss their applications. We conclude by offering recommendations for starting an animal identification project, illustrating current limitations, and proposing how they might be addressed in the future.
- Award ID(s):
- 2010768
- PAR ID:
- 10310734
- Date Published:
- Journal Name:
- Integrative and Comparative Biology
- Volume:
- 61
- Issue:
- 3
- ISSN:
- 1540-7063
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Nonverbal communication, such as body language, facial expressions, and hand gestures, is crucial to human communication, as it conveys more information about emotions and attitudes than spoken words. However, individuals who are blind or have low vision (BLV) may not have access to this channel of communication, leading to asymmetry in conversations. Developing systems to recognize nonverbal communication cues (NVCs) for the BLV community would enhance communication and understanding for both parties. This paper focuses on developing a multimodal computer vision system to recognize and detect NVCs. To accomplish our objective, we are collecting a dataset focused on nonverbal communication cues. Here, we propose a baseline model for recognizing NVCs and present initial results on the Aff-Wild2 dataset. Our baseline model achieved an accuracy of 68% and an F1-score of 64% on the Aff-Wild2 validation set, making it comparable with previous state-of-the-art results. Furthermore, we discuss the various challenges associated with NVC recognition as well as the limitations of our current work.
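As a hedged illustration of the reported validation metrics, the snippet below shows how accuracy and F1 are typically computed for a multi-class NVC classifier with scikit-learn. The labels are placeholders, and macro averaging is an assumption, since the abstract does not specify the averaging scheme.

```python
# Sketch of the evaluation step for a multi-class NVC classifier.
# y_true / y_pred are placeholders; macro averaging is an assumption.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 1, 0, 2]   # ground-truth NVC classes (placeholder)
y_pred = [0, 1, 1, 1, 0, 2]   # model predictions (placeholder)

print(f"accuracy: {accuracy_score(y_true, y_pred):.2%}")
print(f"macro F1: {f1_score(y_true, y_pred, average='macro'):.2%}")
```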
-
The development of deep convolutional neural networks (CNNs) has recently led to great successes in computer vision, and CNNs have become de facto computational models of vision. However, a growing body of work suggests that they exhibit critical limitations beyond image categorization. Here, we study one such fundamental limitation: judging whether two simultaneously presented items are the same or different (SD), compared to a baseline assessment of their spatial relationship (SR). In both human subjects and artificial neural networks, we test the prediction that SD tasks recruit additional cortical mechanisms which underlie critical aspects of visual cognition that are not explained by current computational models. We thus recorded EEG signals from human participants engaged in the same tasks as the computational models. Importantly, in humans the two tasks were matched in terms of difficulty by an adaptive psychometric procedure; yet, on top of a modulation of evoked potentials, our results revealed higher activity in the low beta (16–24 Hz) band in the SD compared to the SR conditions. We surmise that these oscillations reflect the crucial involvement of additional mechanisms, such as working memory and attention, which are missing in current feed-forward CNNs.
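For readers unfamiliar with the paradigm, the sketch below shows one common way a feed-forward CNN is probed on a same-different (SD) task: the two items are rendered into separate channels of a single input, and the network outputs a binary same/different decision. The architecture and input size are illustrative assumptions, not the models used in the study.

```python
# Minimal sketch of a feed-forward same-different (SD) probe:
# two items, one per input channel; output logits for same vs. different.
# Architecture and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SDNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 8 * 8, 64), nn.ReLU(),
            nn.Linear(64, 2),  # logits: same vs. different
        )

    def forward(self, x):  # x: (B, 2, 32, 32), one item per channel
        return self.classifier(self.features(x))

logits = SDNet()(torch.randn(4, 2, 32, 32))  # 4 stimulus pairs
```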
-
Photo Sleuth: Combining Collective Intelligence and Computer Vision to Identify Historical Portraits

Identifying people in photographs is a critical task in a wide variety of domains, from national security [7] to journalism [14] to human rights investigations [1]. The task is also fundamentally complex and challenging. With the world population at 7.6 billion and growing, the candidate pool is large. Studies of human face recognition ability show that the average person incorrectly identifies two people as similar 20–30% of the time, and trained police detectives do not perform significantly better [11]. Computer vision-based face recognition tools have gained considerable ground and are now widely available commercially, but comparisons to human performance show mixed results at best [2,10,16].

Automated face recognition techniques, while powerful, also have constraints that may be impractical for many real-world contexts. For example, face recognition systems tend to suffer when the target image or reference images have poor quality or resolution, as blemishes or discolorations may be incorrectly recognized as false positives for facial landmarks. Additionally, most face recognition systems ignore some salient facial features, like scars or other skin characteristics, as well as distinctive non-facial features, like ear shape or hair or facial hair styles.

This project investigates how we can overcome these limitations to support person identification tasks. By adjusting confidence thresholds, users of face recognition can generally expect high recall (few false negatives) at the cost of low precision (many false positives). Therefore, we focus our work on the "last mile" of person identification, i.e., helping a user find the correct match among a large set of similar-looking candidates suggested by face recognition. Our approach leverages the powerful capabilities of the human vision system and collaborative sensemaking via crowdsourcing to augment the complementary strengths of automatic face recognition. The result is a novel technology pipeline combining collective intelligence and computer vision.

We scope this project to focus on identifying soldiers in photos from the American Civil War era (1861–1865). An estimated 4,000,000 soldiers fought in the war, and most were photographed at least once, due to decreasing costs, the increasing robustness of the format, and the critical events separating friends and family [17]. Over 150 years later, the identities of most of these portraits have been lost, but as museums and archives increasingly digitize and publish their collections online, the pool of reference photos and information has never been more accessible. Historians, genealogists, and collectors work tirelessly to connect names with faces, using largely manual identification methods [3,9]. Identifying people in historical photos is important for preserving material culture [9], correcting the historical record [13], and recognizing contributions of marginalized groups [4], among other reasons.
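The precision/recall trade-off the pipeline builds on can be sketched in a few lines: score a gallery of reference-portrait embeddings against a query face embedding and keep everything above a deliberately low similarity threshold, so the true match is rarely missed and humans handle the "last mile" of choosing among similar-looking candidates. The random embeddings and the 0.2 threshold below are placeholders for a real face recognizer's output, not the project's actual system.

```python
# Sketch of threshold-based candidate shortlisting from face embeddings.
# Embeddings and threshold are placeholders for a real recognizer's output.
import numpy as np

rng = np.random.default_rng(0)
gallery = rng.normal(size=(10_000, 128))          # reference portrait embeddings
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
query = rng.normal(size=128)                      # query face embedding
query /= np.linalg.norm(query)

scores = gallery @ query                          # cosine similarity
threshold = 0.2                                   # low threshold -> high recall, low precision
candidates = np.argsort(scores)[::-1]             # best matches first
candidates = candidates[scores[candidates] >= threshold]
print(f"{candidates.size} candidates passed to human review")
```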
-
Systems for collecting image data in conjunction with computer vision techniques are a powerful tool for increasing the temporal resolution at which plant phenotypes can be measured non-destructively. Computational tools that are flexible and extendable are needed to address the diversity of plant phenotyping problems. We previously described the Plant Computer Vision (PlantCV) software package, an image processing toolkit for plant phenotyping analysis. The goal of the PlantCV project is to develop a set of modular, reusable, and repurposable tools for plant image analysis that are open-source and community-developed. Here we present the details and rationale for major developments in the second major release of PlantCV. In addition to overall improvements in the organization of the PlantCV project, new functionality includes a set of new image processing and normalization tools, support for analyzing images that include multiple plants, leaf segmentation, landmark identification tools for morphometrics, and modules for machine learning.
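As a generic illustration (not the PlantCV API), the OpenCV sketch below performs the kind of thresholding-based segmentation and per-plant measurement that PlantCV modularizes, including support for images containing multiple plants; the file path and HSV colour bounds are illustrative assumptions.

```python
# Generic OpenCV sketch of thresholding-based plant segmentation of the
# kind a PlantCV workflow modularizes (NOT the PlantCV API).
# Path and HSV bounds are illustrative assumptions.
import cv2
import numpy as np

img = cv2.imread("plant.png")                           # BGR image (placeholder path)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))   # rough "green tissue" band
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))  # clean noise

# One external contour per plant supports multi-plant images.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for i, c in enumerate(contours):
    print(f"plant {i}: area = {cv2.contourArea(c):.0f} px")
```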