Title: KABR: In-Situ Dataset for Kenyan Animal Behavior Recognition from Drone Videos
We present a novel dataset for animal behavior recognition collected in-situ using video from drones flown over the Mpala Research Centre in Kenya. Videos from DJI Mavic 2S drones flown in January 2023 were acquired at 5.4K resolution in accordance with IACUC protocols and processed to detect and track each animal in the frames. An image subregion centered on each animal was extracted and combined in sequence to form a "mini-scene". Behaviors were then manually labeled for each frame of each mini-scene by a team of annotators overseen by an expert behavioral ecologist. The labeled mini-scenes form our behavior dataset, consisting of more than 10 hours of annotated video of reticulated giraffes, plains zebras, and Grevy's zebras, and encompassing seven types of animal behavior plus an additional category for occlusions. Benchmark results for state-of-the-art behavior recognition architectures show labeling accuracy of 61.9% macro-average (per class) and 86.7% micro-average (per instance). Our dataset complements recent larger, more diverse animal behavior datasets and smaller, more specialized ones by being collected in-situ and from drones, both important considerations for the future of animal behavior research. The dataset can be accessed at https://dirtmaxim.github.io/kabr.
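The gap between the macro-average (61.9%, per class) and micro-average (86.7%, per instance) reported above reflects the dataset's class imbalance: frequent behaviors dominate the per-instance score while rare behaviors drag down the per-class score. A minimal sketch of the distinction, with hypothetical behavior labels and data (not drawn from the actual benchmark):

```python
from collections import defaultdict

def micro_macro_accuracy(y_true, y_pred):
    """Compute per-instance (micro) and per-class (macro) accuracy.

    Micro accuracy counts every labeled frame equally, so frequent
    behaviors dominate; macro accuracy averages the per-class
    accuracies, so rare behaviors carry equal weight.
    """
    total = correct = 0
    per_class = defaultdict(lambda: [0, 0])  # class -> [correct, total]
    for t, p in zip(y_true, y_pred):
        total += 1
        per_class[t][1] += 1
        if t == p:
            correct += 1
            per_class[t][0] += 1
    micro = correct / total
    macro = sum(c / n for c, n in per_class.values()) / len(per_class)
    return micro, macro

# Toy imbalanced example (hypothetical labels): the rare class is
# always misclassified, so macro accuracy falls well below micro.
true = ["graze"] * 8 + ["trot"] * 2
pred = ["graze"] * 8 + ["graze"] * 2
micro, macro = micro_macro_accuracy(true, pred)
# micro = 0.8 (8/10 frames correct); macro = 0.5 (mean of 1.0 and 0.0)
```

A classifier can thus look strong per instance while failing on the long tail, which is why both averages are reported.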
Award ID(s):
2118240
PAR ID:
10530250
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-7028-7
Page Range / eLocation ID:
31 to 40
Subject(s) / Keyword(s):
In-situ imageomics video from drones animal behavior behavior recognition from video
Format(s):
Medium: X
Location:
Waikoloa, HI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we extend the dataset statistics, model benchmarks, and performance analysis for the recently published KABR dataset, an in situ dataset for ungulate behavior recognition using aerial footage from the Mpala Research Centre in Kenya. The dataset comprises video footage of reticulated giraffes (lat. Giraffa reticulata), Plains zebras (lat. Equus quagga), and Grévy’s zebras (lat. Equus grevyi) captured using a DJI Mavic 2S drone. It includes both spatiotemporal (i.e., mini-scenes) and behavior annotations provided by an expert behavioral ecologist. In total, KABR has more than 10 hours of annotated video. We extend the previous work in four key areas by: (i) providing comprehensive dataset statistics to reveal new insights into the data distribution across behavior classes and species; (ii) extending the set of existing benchmark models to include a new state-of-the-art transformer; (iii) investigating weight initialization strategies and exploring whether pretraining on human action recognition datasets is transferable to in situ animal behavior recognition directly (i.e., zero-shot) or as initialization for end-to-end model training; and (iv) performing a detailed statistical analysis of the performance of these models across species, behavior, and formally defined segments of the long-tailed distribution. The KABR dataset addresses the limitations of previous datasets sourced from controlled environments, offering a more authentic representation of natural animal behaviors. This work marks a significant advancement in the automatic analysis of wildlife behavior, leveraging drone technology to overcome traditional observational challenges and enabling a more nuanced understanding of animal interactions in their natural habitats. The dataset is available at https://kabrdata.xyz 
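The mini-scene construction that both KABR papers describe (a fixed-size video subregion tracking each animal) can be sketched as follows. The function name, the 224-pixel window, and the synthetic inputs are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def extract_mini_scene(frames, track, size=224):
    """Crop a fixed-size window centered on one tracked animal in every
    frame, producing a "mini-scene" clip for behavior recognition.

    frames: (T, H, W, C) uint8 video array
    track:  list of (cx, cy) box centers, one per frame, assumed to come
            from an upstream detector/tracker
    """
    half = size // 2
    clip = []
    for frame, (cx, cy) in zip(frames, track):
        h, w = frame.shape[:2]
        # Clamp the window so it stays inside the frame near the borders.
        x0 = min(max(cx - half, 0), w - size)
        y0 = min(max(cy - half, 0), h - size)
        clip.append(frame[y0:y0 + size, x0:x0 + size])
    return np.stack(clip)  # (T, size, size, C)

# Example: a 5-frame mini-scene from a synthetic 1080p clip, with the
# tracked animal drifting to the right.
video = np.zeros((5, 1080, 1920, 3), dtype=np.uint8)
track = [(400 + 10 * t, 600) for t in range(5)]
mini_scene = extract_mini_scene(video, track)
# mini_scene.shape == (5, 224, 224, 3)
```

Because the crop follows the animal, the downstream recognizer sees a stabilized view of the individual rather than the full aerial frame.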
  2. Using unmanned aerial vehicles (UAVs) to track multiple individuals simultaneously in their natural environment is a powerful approach for better understanding the collective behavior of primates. Previous studies have demonstrated the feasibility of automating primate behavior classification from video data, but these studies have been carried out in captivity or from ground-based cameras. However, to understand group behavior and the self-organization of a collective, the whole troop needs to be seen at a scale where behavior can be seen in relation to the natural environment in which ecological decisions are made. To tackle this challenge, this study presents a novel dataset for baboon detection, tracking, and behavior recognition from drone videos where troops are observed on-the-move in their natural environment as they move to and from their sleeping sites. Videos were captured from drones at Mpala Research Centre, a research station located in Laikipia County, in central Kenya. The baboon detection dataset was created by manually annotating all baboons in drone videos with bounding boxes. A tiling method was subsequently applied to create a pyramid of images at various scales from the original 5.3K resolution images, resulting in approximately 30K images used for baboon detection. The baboon tracking dataset is derived from the baboon detection dataset, where bounding boxes are consistently assigned the same ID throughout the video. This process resulted in half an hour of dense tracking data. The baboon behavior recognition dataset was generated by converting tracks into mini-scenes, a video subregion centered on each animal. These mini-scenes were annotated with 12 distinct behavior types and one additional category for occlusion, resulting in over 20 hours of data. 
Benchmark results show mean average precision (mAP) of 92.62% for the YOLOv8-X detection model, multiple object tracking precision (MOTP) of 87.22% for the DeepSORT tracking algorithm, and micro top-1 accuracy of 64.89% for the X3D behavior recognition model. Using deep learning to rapidly and accurately classify wildlife behavior from drone footage facilitates non-invasive data collection on behavior enabling the behavior of a whole group to be systematically and accurately recorded. The dataset can be accessed at https://baboonland.xyz. 
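The tiling method described above, which builds a pyramid of images at several scales from the high-resolution originals, can be sketched roughly as follows. The tile size, scale factors, and nearest-neighbor resize are assumptions for illustration; the paper's exact parameters may differ:

```python
import numpy as np

def tile_pyramid(image, tile=640, scales=(1.0, 0.5, 0.25)):
    """Cut a high-resolution aerial frame into fixed-size tiles at
    several downscaled resolutions: native-scale tiles keep small
    animals detectable, coarser levels preserve wider context.
    """
    tiles = []
    for s in scales:
        h = max(int(image.shape[0] * s), tile)
        w = max(int(image.shape[1] * s), tile)
        # Nearest-neighbor resize via index sampling (no external deps).
        ys = np.arange(h) * image.shape[0] // h
        xs = np.arange(w) * image.shape[1] // w
        level = image[ys][:, xs]
        # Slice the resized level into non-overlapping tiles.
        for y in range(0, h - tile + 1, tile):
            for x in range(0, w - tile + 1, tile):
                tiles.append(level[y:y + tile, x:x + tile])
    return tiles
```

Feeding the detector tiles instead of whole 5.3K frames keeps each input at a resolution the model can handle while still covering the full scene.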
  3. Wildfires are among the deadliest and most dangerous natural disasters in the world. They burn millions of acres of forest and endanger the lives of many humans and animals. Predicting fire behavior can help firefighters manage and schedule responses to future incidents, and it reduces the risks firefighters face. Recent advances in aerial imaging show that it can be beneficial in wildfire studies. Among the methods and technologies for acquiring aerial images, unmanned aerial vehicles (UAVs) and drones are well suited to collecting information about a fire. This study provides an aerial imagery dataset collected using drones during a prescribed pile fire in Northern Arizona, USA. The dataset consists of several repositories, including raw aerial videos recorded by the drones' cameras and raw heatmap footage recorded by an infrared thermal camera. To help researchers, two well-known tasks, fire classification and fire segmentation, are defined on the dataset. For approaches such as neural networks (NNs) applied to fire classification, 39,375 frames are labeled ("Fire" vs. "Non-Fire") for the training phase, and another 8,617 frames are labeled as test data. For fire segmentation, 2,003 frames are considered, and 2,003 corresponding masks with pixel-wise annotation are generated as ground truth. 
  4. The Arctic is warming at three times the rate of the global average, affecting the habitat and lifecycles of migratory species that reproduce there, such as birds and caribou. Ecoacoustic monitoring can help efficiently track changes in animal phenology and behavior over large areas so that the impacts of climate change on these species can be better understood and potentially mitigated. We introduce here the Ecoacoustic Dataset from Arctic North Slope Alaska (EDANSA-2019), a dataset collected by a network of 100 autonomous recording units covering an area of 9,000 square miles over the course of the 2019 summer season on the North Slope of Alaska and neighboring regions. We labeled over 27 hours of this dataset according to 28 tags, with enough instances of 9 important environmental classes to train baseline convolutional recognizers. We are releasing this dataset and the corresponding baseline to the community to accelerate the recognition of these sounds and facilitate automated analyses of large-scale ecoacoustic databases. 
  5. We address the problem of human action classification in drone videos. Due to the high cost of capturing and labeling large-scale drone videos with diverse actions, we present unsupervised and semi-supervised domain adaptation approaches that leverage both existing fully annotated action recognition datasets and unannotated (or only sparsely annotated) videos from drones. To study the emerging problem of drone-based action recognition, we create a new dataset, NEC-DRONE, containing 5,250 videos to evaluate the task. We tackle both problem settings: 1) the same and 2) different action label sets for the source (e.g., the Kinetics dataset) and target domains (drone videos). We present a combination of video- and instance-based adaptation methods, paired with either a classifier or an embedding-based framework, to transfer knowledge from source to target. Our results show that the proposed adaptation approach substantially improves performance on these challenging and practical tasks. We further demonstrate the applicability of our method to learning cross-view action recognition on the Charades-Ego dataset. We provide qualitative analysis to understand the behaviors of our approaches. 