Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Using unmanned aerial vehicles (UAVs) to track multiple individuals simultaneously in their natural environment is a powerful approach for better understanding the collective behavior of primates. Previous studies have demonstrated the feasibility of automating primate behavior classification from video data, but these studies have been carried out in captivity or from ground-based cameras. However, to understand group behavior and the self-organization of a collective, the whole troop needs to be seen at a scale where behavior can be seen in relation to the natural environment in which ecological decisions are made. To tackle this challenge, this study presents a novel dataset for baboon detection, tracking, and behavior recognition from drone videos where troops are observed on-the-move in their natural environment as they move to and from their sleeping sites. Videos were captured from drones at Mpala Research Centre, a research station located in Laikipia County, in central Kenya. The baboon detection dataset was created by manually annotating all baboons in drone videos with bounding boxes. A tiling method was subsequently applied to create a pyramid of images at various scales from the original 5.3K resolution images, resulting in approximately 30K images used for baboon detection. The baboon tracking dataset is derived from the baboon detection dataset, where bounding boxes are consistently assigned the same ID throughout the video. This process resulted in half an hour of dense tracking data. The baboon behavior recognition dataset was generated by converting tracks into mini-scenes, a video subregion centered on each animal. These mini-scenes were annotated with 12 distinct behavior types and one additional category for occlusion, resulting in over 20 hours of data. Benchmark results show mean average precision (mAP) of 92.62% for the YOLOv8-X detection model, multiple object tracking precision (MOTP) of 87.22% for the DeepSORT tracking algorithm, and micro top-1 accuracy of 64.89% for the X3D behavior recognition model. Using deep learning to rapidly and accurately classify wildlife behavior from drone footage facilitates non-invasive data collection on behavior enabling the behavior of a whole group to be systematically and accurately recorded. The dataset can be accessed at https://baboonland.xyz.more » « lessFree, publicly-accessible full text available June 16, 2026
-
Abstract Access to large image volumes through camera traps and crowdsourcing provides novel possibilities for animal monitoring and conservation. It calls for automatic methods for analysis, in particular, when re-identifying individual animals from the images. Most existing re-identification methods rely on either hand-crafted local features or end-to-end learning of fur pattern similarity. The former does not need labeled training data, while the latter, although very data-hungry typically outperforms the former when enough training data is available. We propose a novel re-identification pipeline that combines the strengths of both approaches by utilizing modern learnable local features and feature aggregation. This creates representative pattern feature embeddings that provide high re-identification accuracy while allowing us to apply the method to small datasets by using pre-trained feature descriptors. We report a comprehensive comparison of different modern local features and demonstrate the advantages of the proposed pipeline on two very different species.more » « less
-
In this paper, we extend the dataset statistics, model benchmarks, and performance analysis for the recently published KABR dataset, an in situ dataset for ungulate behavior recognition using aerial footage from the Mpala Research Centre in Kenya. The dataset comprises video footage of reticulated giraffes (lat. Giraffa reticulata), Plains zebras (lat. Equus quagga), and Grévy’s zebras (lat. Equus grevyi) captured using a DJI Mavic 2S drone. It includes both spatiotemporal (i.e., mini-scenes) and behavior annotations provided by an expert behavioral ecologist. In total, KABR has more than 10 hours of annotated video. We extend the previous work in four key areas by: (i) providing comprehensive dataset statistics to reveal new insights into the data distribution across behavior classes and species; (ii) extending the set of existing benchmark models to include a new state-of-the-art transformer; (iii) investigating weight initialization strategies and exploring whether pretraining on human action recognition datasets is transferable to in situ animal behavior recognition directly (i.e., zero-shot) or as initialization for end-to-end model training; and (iv) performing a detailed statistical analysis of the performance of these models across species, behavior, and formally defined segments of the long-tailed distribution. The KABR dataset addresses the limitations of previous datasets sourced from controlled environments, offering a more authentic representation of natural animal behaviors. This work marks a significant advancement in the automatic analysis of wildlife behavior, leveraging drone technology to overcome traditional observational challenges and enabling a more nuanced understanding of animal interactions in their natural habitats. The dataset is available at https://kabrdata.xyzmore » « lessFree, publicly-accessible full text available December 21, 2025
-
We present a novel dataset for animal behavior recognition collected in-situ using video from drones flown over the Mpala Research Centre in Kenya. Videos from DJI Mavic 2S drones flown in January 2023 were acquired at 5.4K resolution in accordance with IACUC protocols, and processed to detect and track each animal in the frames. An image subregion centered on each animal was extracted and combined in sequence to form a “mini-scene”. Be-haviors were then manually labeled for each frame of each mini-scene by a team of annotators overseen by an expert behavioral ecologist. The resulting labeled mini-scenes form our resulting behavior dataset, consisting of more than 10 hours of annotated videos of reticulated gi-raffes, plains zebras, and Grevy's zebras, and encompassing seven types of animal behavior and an additional category for occlusions. Benchmark results for state-of-the-art behavioral recognition architectures show labeling accu-racy of 61.9% for macro-average (per class), and 86.7% for micro-average (per instance). Our dataset complements recent larger, more diverse animal behavior sets and smaller, more specialized ones by being collected in-situ and from drones, both important considerations for the future of an-imal behavior research. The dataset can be accessed at https://dirtmaxim.github.io/kabr.more » « less
-
In situ imageomics is a new approach to study ecological, biological and evolutionary systems wherein large image and video data sets are captured in the wild and machine learning methods are used to infer biological traits of individual organisms, animal social groups, species, and even whole ecosystems. Monitoring biological traits over large spaces and long periods of time could enable new, data-driven approaches to wildlife conservation, biodiversity, and sustainable ecosystem management. However, to accurately infer biological traits, machine learning methods for images require voluminous and high quality data. Adaptive, data-driven approaches are hamstrung by the speed at which data can be captured and processed. Camera traps and unmanned aerial vehicles (UAVs) produce voluminous data, but lose track of individuals over large areas, fail to capture social dynamics, and waste time and storage on images with poor lighting and view angles. In this vision paper, we make the case for a research agenda for in situ imageomics that depends on significant advances in autonomic and self-aware computing. Precisely, we seek autonomous data collection that manages camera angles, aircraft positioning, conflicting actions for multiple traits of interest, energy availability, and cost factors. Given the tools to detect object and identify individuals, we propose a research challenge: Which optimization model should the data collection system employ to accurately identify, characterize, and draw inferences from biological traits while respecting a budget? Using zebra and giraffe behavioral data collected over three weeks at the Mpala Research Centre in Laikipia County, Kenya, we quantify the volume and quality of data collected using existing approaches. Our proposed autonomic navigation policy for in situ imageomics collection has an F1 score of 82% compared to an expert pilot, and provides greater safety and consistency, suggesting great potential for state-of-the-art autonomic approaches if they can be scaled up to fully address the problem.more » « less
An official website of the United States government

Full Text Available