

This content will become publicly available on June 16, 2026

Title: ACT360: An Efficient 360-Degree Action Detection and Summarization Framework for Mission-Critical Training and Debriefing
Effective training and debriefing are critical in high-stakes, mission-critical environments such as firefighting, where precision and error minimization are paramount. Traditional post-training analysis relies on manual review of 2D video, a process that is time-consuming and lacks comprehensive situational awareness. To address these limitations, we introduce ACT360, a novel system that leverages 360-degree video and machine learning for automated action detection and efficient debriefing. ACT360 incorporates 360YOWO, a customized You Only Watch Once (YOWO) model enhanced with a spatial attention mechanism and equirectangular-aware convolution (EAC) to handle the unique distortions of panoramic video data. To enable deployment in resource-constrained environments, we apply quantization and model pruning, reducing the model size by 74% while maintaining robust accuracy (a mAP drop of only 1.5%, from 0.865 to 0.850) and improving inference speed. We validate our approach on a new, publicly available dataset of 55 labeled 360-degree videos covering seven key firefighting actions, recorded across various real-world practice sessions and environmental conditions. Furthermore, we integrate the pipeline with 360AIE (Action Insight Explorer), a web-based interface that provides automatic action detection, retrieval, and textual summarization of key events using large language models (LLMs), significantly improving post-incident analysis efficiency. ACT360 serves as a generalized framework for mission-critical debriefing, incorporating techniques such as EAC, spatial attention, summarization, and model optimization. These innovations apply to any training environment requiring lightweight action detection and structured post-exercise analysis.
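As a rough illustration of the compression step described above, the sketch below combines magnitude pruning and dynamic int8 quantization using stock PyTorch utilities. The backbone is a stand-in (360YOWO itself is not public), and the layer choices and 0.74 sparsity target are illustrative assumptions rather than the paper's exact recipe.

```python
# Minimal sketch of pruning + quantization in the spirit of ACT360's
# reported 74% size reduction. The model is a placeholder, not 360YOWO.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(                 # stand-in for a YOWO-style backbone
    nn.Conv3d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool3d(1),
    nn.Flatten(),
    nn.Linear(64, 7),                  # seven firefighting action classes
)

# L1 unstructured pruning: zero out the smallest-magnitude weights.
for module in model.modules():
    if isinstance(module, (nn.Conv3d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.74)
        prune.remove(module, "weight")  # make the sparsity permanent

# Dynamic int8 quantization of the linear layers for CPU inference.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    clip = torch.randn(1, 3, 16, 224, 224)   # (batch, C, T, H, W)
    print(quantized(clip).shape)              # -> torch.Size([1, 7])
```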
Award ID(s):
2140645 2106592 1900875
PAR ID:
10635552
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3315-8646-1
Page Range / eLocation ID:
34 to 41
Subject(s) / Keyword(s):
Action Detection; Distortion; Optimal Model; Attention Mechanism; Model Size; Spatial Attention; Manual Review; Training Environment; Spatial Attention Mechanism; 2D Video; 360-degree Video; Latitude; Emergency Response; Convolutional Layers; Nighttime Exercise; Training; Object Detection; Convolution Operation; Action Recognition; 2D Model; Relevant Regions; Inference Time; Mission-critical Applications; Video Understanding; Google Cloud Platform; Temporal Model; Manual Selection; Accuracy Of Feature Extraction
Format(s):
Medium: X
Location:
Cork, Ireland
Sponsoring Org:
National Science Foundation
More Like this
  1. Paaßen, Benjamin; Demmans Epp, Carrie (Eds.)
    Multimodal Learning Analytics (MMLA) has emerged as a powerful approach within the computer-supported collaborative learning community, offering nuanced insights into learning processes through diverse data sources. Despite its potential, the prevalent reliance on traditional instruments such as tripod-mounted digital cameras for video capture often results in suboptimal data quality for captured facial expressions, which are crucial for understanding collaborative dynamics. This study introduces an innovative approach to overcome this limitation by employing 360-degree camera technology to capture students' facial features while they collaborate in small working groups. A comparative analysis of 1.5 hours of video data from both traditional tripod-mounted digital cameras and 360-degree cameras evaluated the efficacy of these methods in capturing Facial Action Units (AUs) and facial keypoints. Analysis with OpenFace revealed that the 360-degree camera captured high-quality facial features in 33.17% of frames, significantly outperforming the traditional method's 8.34%, thereby enhancing reliability in facial feature detection. The findings suggest a pathway for future research to integrate 360-degree camera technology in MMLA. Future research directions involve refining this technology further to improve the detection of affective states in collaborative learning environments, thereby offering a richer understanding of the learning process.
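A sketch of how such a frame-quality percentage could be computed from OpenFace's per-frame CSV output follows. The use of the "success" and "confidence" columns, the 0.8 confidence threshold, and the file name are assumptions about the study's criteria, not details taken from the paper.

```python
# Hypothetical computation of the share of "high-quality" frames from an
# OpenFace FeatureExtraction CSV, akin to the 33.17% vs. 8.34% comparison.
import pandas as pd

def high_quality_fraction(csv_path: str, min_conf: float = 0.8) -> float:
    df = pd.read_csv(csv_path)
    df.columns = df.columns.str.strip()   # OpenFace pads column names
    ok = (df["success"] == 1) & (df["confidence"] >= min_conf)
    return 100.0 * ok.mean()

# 'group1_360cam.csv' is a made-up file name for illustration.
print(f"{high_quality_fraction('group1_360cam.csv'):.2f}% usable frames")
```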
  2.
    We investigate a novel communications system that integrates scalable multi-layer 360-degree video tiling, viewport-adaptive rate-distortion optimal resource allocation, and VR-centric edge computing and caching, to enable future high-quality untethered VR streaming. Our system comprises a collection of 5G small cells that can pool their communication, computing, and storage resources to collectively deliver scalable 360-degree video content to mobile VR clients at much higher quality. Our major contributions are the rigorous design of multi-layer 360-degree tiling and related models of statistical user navigation, and the analysis and optimization of edge-based multi-user VR streaming that integrates viewport adaptation and server cooperation. We also explore the possibility of network coded data operation and its implications for the analysis, optimization, and system performance we pursue here. We demonstrate considerable gains in delivered immersion fidelity, featuring much higher 360-degree viewport peak signal-to-noise ratio (PSNR) and VR video frame rates and spatial resolutions.
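To make the viewport-adaptive rate allocation idea concrete, here is a simplified greedy sketch: each tile has candidate (bitrate, quality) layers, and bandwidth is repeatedly spent on the upgrade with the best viewport-weighted quality gain per bit. The tile utilities and view probabilities are made-up inputs; the paper's actual optimization (multi-layer tiling, server cooperation, network coding) is far more elaborate.

```python
# Greedy viewport-weighted rate allocation over tiles (illustrative only).
from heapq import heappush, heappop

def allocate(tiles, budget):
    """tiles: {tile_id: (view_prob, [(rate, quality), ...])}, rates ascending."""
    chosen = {t: 0 for t in tiles}            # start every tile at layer 0
    spent = sum(layers[0][0] for _, layers in tiles.values())
    heap = []
    for t, (p, layers) in tiles.items():
        if len(layers) > 1:
            dr = layers[1][0] - layers[0][0]
            dq = p * (layers[1][1] - layers[0][1])
            heappush(heap, (-dq / dr, t))      # max-heap on gain per bit
    while heap:
        _, t = heappop(heap)
        p, layers = tiles[t]
        i = chosen[t]
        dr = layers[i + 1][0] - layers[i][0]
        if spent + dr > budget:
            continue                           # upgrade does not fit
        spent += dr
        chosen[t] = i + 1
        if chosen[t] + 1 < len(layers):        # queue the next upgrade
            dr2 = layers[i + 2][0] - layers[i + 1][0]
            dq2 = p * (layers[i + 2][1] - layers[i + 1][1])
            heappush(heap, (-dq2 / dr2, t))
    return chosen

tiles = {"front": (0.7, [(1, 30), (3, 38), (6, 42)]),
         "back":  (0.1, [(1, 30), (3, 38), (6, 42)])}
print(allocate(tiles, budget=8))   # -> {'front': 2, 'back': 0}
```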
  3. Recent advances in computer vision algorithms and video streaming technologies have facilitated the development of edge-server-based video analytics systems, enabling them to process sophisticated real-world tasks, such as traffic surveillance and workspace monitoring. Meanwhile, due to their omnidirectional recording capability, 360-degree cameras have been proposed to replace traditional cameras in video analytics systems to offer enhanced situational awareness. Yet, we found that providing an efficient 360-degree video analytics framework is a non-trivial task. Due to the higher resolution and geometric distortion in 360-degree videos, existing video analytics pipelines fail to meet the performance requirements for end-to-end latency and query accuracy. To address these challenges, we introduce the innovative ST-360 framework specifically designed for 360-degree video analytics. This framework features a spatial-temporal filtering algorithm that optimizes both data transmission and computational workloads. Evaluation of the ST-360 framework on a unique dataset of 360-degree first-responder videos reveals that it yields accurate query results with a 50% reduction in end-to-end latency compared to state-of-the-art methods.
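The kind of spatial-temporal filtering ST-360 describes can be reduced to a very simple form: split each equirectangular frame into tiles and forward to the edge server only tiles whose pixel content has changed enough since the last upload. The grid size and threshold below are illustrative assumptions; the paper's filter is more refined.

```python
# Toy spatial-temporal tile filter over equirectangular frames.
import numpy as np

def changed_tiles(prev, curr, grid=(4, 8), thresh=12.0):
    """Yield (row, col, tile) for tiles whose mean abs change exceeds thresh."""
    h, w = curr.shape[:2]
    th, tw = h // grid[0], w // grid[1]
    for r in range(grid[0]):
        for c in range(grid[1]):
            a = prev[r*th:(r+1)*th, c*tw:(c+1)*tw]
            b = curr[r*th:(r+1)*th, c*tw:(c+1)*tw]
            if np.abs(b.astype(np.float32) - a.astype(np.float32)).mean() > thresh:
                yield r, c, b

prev = np.zeros((1920, 3840, 3), dtype=np.uint8)      # equirectangular frame
curr = prev.copy()
curr[0:480, 0:480] = 255                               # motion in one corner
print([(r, c) for r, c, _ in changed_tiles(prev, curr)])   # -> [(0, 0)]
```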
  4.
    Future view prediction for a 360-degree video streaming system is important for saving network bandwidth and improving the Quality of Experience (QoE). Historical view data of a single viewer and of multiple viewers have been used for future view prediction. Video semantic information is also useful for predicting the viewer's future behavior. However, extracting video semantic information requires powerful computing hardware and large memory space to perform deep learning-based video analysis. This is not desirable for most client devices, such as small mobile devices or Head-Mounted Displays (HMDs). Therefore, we develop an approach where video semantic analysis is executed on the media server, and the analysis results are shared with clients via the Semantic Flow Descriptor (SFD) and View-Object State Machine (VOSM). SFD and VOSM become new descriptive additions to the Media Presentation Description (MPD) and Spatial Relation Description (SRD) to support 360-degree video streaming. Using the semantic-based approach, we design the Semantic-Aware View Prediction System (SEAWARE) to improve the overall view prediction performance. The evaluation results of 360-degree videos and real HMD view traces show that the SEAWARE system improves the view prediction performance and streams high-quality video with limited network bandwidth.
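A minimal sketch of the idea, with heavy simplifying assumptions: the client extrapolates its own head motion and blends it with a semantic cue the server advertises for the upcoming segment. The linear extrapolation, the fixed blend weight, and the notion of a single "salient object yaw" per segment are all assumptions for illustration, not SEAWARE's actual mechanism.

```python
# Toy blend of client-side motion extrapolation with a server-supplied
# semantic cue, loosely in the spirit of SEAWARE's SFD/VOSM design.
def predict_yaw(history, semantic_yaw, alpha=0.6):
    """history: recent head yaws in degrees; semantic_yaw: salient-object
    direction advertised by the server for the next segment (hypothetical)."""
    motion = history[-1] + (history[-1] - history[-2])   # linear extrapolation
    blended = alpha * motion + (1 - alpha) * semantic_yaw
    return blended % 360

print(predict_yaw(history=[80.0, 90.0], semantic_yaw=140.0))   # -> 116.0
```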
  5. Video summarization aims to simplify large-scale video browsing by generating concise, short summaries that differ from but well represent the original video. Due to the scarcity of video annotations, recent progress in video summarization concentrates on unsupervised methods, among which GAN-based methods are most prevalent. This type of method includes a summarizer and a discriminator. The summarized video from the summarizer is taken as the final output only if the video reconstructed from this summary cannot be discriminated from the original one by the discriminator. The primary problems of these GAN-based methods are twofold. First, the summarized video produced in this way is a subset of the original video with low redundancy that contains high-priority events/entities; this summarization criterion is not sufficient. Second, the training of the GAN framework is not stable. This paper proposes a novel Entity–relationship Aware video summarization method (ERA) to address the above problems. To be more specific, we introduce an Adversarial Spatio-Temporal network to construct the relationships among entities, which we argue should also be given high priority in the summarization. The GAN training problem is solved by introducing the Wasserstein GAN and two newly proposed video-patch/score-sum losses. In addition, the score-sum loss can also relieve the model's sensitivity to varying video lengths, an inherent problem for most current video analysis tasks. Our method substantially lifts performance on the target benchmark datasets and exceeds the current state of the art. We hope our straightforward yet effective approach will shed some light on future research in unsupervised video summarization. The code is available online.
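A score-sum-style loss of the kind described above can be sketched as follows: penalize the deviation of the mean frame-importance score from a target summary ratio, so that the penalty does not grow with video length. The 0.15 ratio and the squared penalty are assumptions, not the paper's exact formulation.

```python
# Illustrative length-insensitive "score-sum" loss in PyTorch.
import torch

def score_sum_loss(scores: torch.Tensor, target_ratio: float = 0.15):
    """scores: (T,) frame-importance values in [0, 1] from the summarizer.
    Using the mean rather than the raw sum keeps the loss comparable
    across videos of very different lengths."""
    return (scores.mean() - target_ratio) ** 2

short_clip = torch.sigmoid(torch.randn(120))    # 120-frame video
long_clip = torch.sigmoid(torch.randn(3000))    # 3000-frame video
print(score_sum_loss(short_clip).item(), score_sum_loss(long_clip).item())
```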