Title: CSafe: An Intelligent Audio Wearable Platform for Improving Construction Worker Safety in Urban Environments
Vehicle accidents are one of the greatest causes of death and injury in urban areas for pedestrians, workers, and police alike. In this work, we present CSafe, a low-power audio wearable platform that detects, localizes, and provides alerts about oncoming vehicles to improve construction worker safety. Construction worker safety is a much more challenging problem than general urban or pedestrian safety in that the sound of construction tools can be orders of magnitude greater than that of vehicles, making vehicle detection and localization exceptionally difficult. To overcome these challenges, we develop a novel sound source separation algorithm, called Probabilistic Template Matching (PTM), as well as a novel noise filtering architecture to remove loud construction noises from our observed signals. We show that our architecture can improve vehicle detection by up to 12% over other state-of-the-art source separation algorithms. We integrate PTM and our noise filtering architecture into CSafe and show through a series of real-world experiments that CSafe can achieve up to an 82% vehicle detection rate and a 6.90° mean localization error in acoustically noisy construction site scenarios, which is 16% higher and almost 30° lower than state-of-the-art audio wearable safety works.
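The abstract does not specify how PTM works internally, but the template-matching idea behind it can be illustrated with a toy sketch. Everything below (the function name, the Gaussian noise model, and the scale-normalization step) is an assumption for illustration, not the paper's algorithm: a magnitude-spectrum frame is scored against a stored vehicle template, with higher scores indicating a better match.

```python
import numpy as np

def template_match_score(frame, template, noise_var=1e-2):
    """Score how well a magnitude-spectrum frame matches a source
    template under a simple Gaussian noise model (illustrative only).

    frame, template: 1-D magnitude spectra of equal length.
    Returns a log-likelihood-style score (higher = better match).
    """
    # Normalize both spectra so the comparison is scale-invariant.
    f = frame / (np.linalg.norm(frame) + 1e-12)
    t = template / (np.linalg.norm(template) + 1e-12)
    residual = f - t
    # Gaussian log-likelihood of the residual under the noise model.
    return -0.5 * np.sum(residual ** 2) / noise_var

# Toy usage: a frame resembling the template scores higher than
# one that does not.
template = np.array([0.1, 0.8, 0.4, 0.1])
match    = np.array([0.12, 0.79, 0.41, 0.09])
mismatch = np.array([0.9, 0.1, 0.1, 0.8])
```

In a probabilistic framing like this, per-frame scores for several candidate source templates can be compared (or thresholded) to decide which sources are present before localization.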
Award ID(s):
1704899 1815274
PAR ID:
10230890
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 20th International Conference on Information Processing in Sensor Networks
Page Range / eLocation ID:
207 to 221
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With the prevalence of smartphones, pedestrians and joggers today often walk or run while listening to music. Since they are deprived of their auditory senses that would have provided important cues to dangers, they are at a much greater risk of being hit by cars or other vehicles. In this paper, we build a wearable system that uses multi-channel audio sensors embedded in a headset to help detect and locate cars from their honks, engine and tire noises, and warn pedestrians of imminent dangers of approaching cars. We demonstrate that using a segmented architecture consisting of headset-mounted audio sensors, a front-end hardware platform that performs signal processing and feature extraction, and machine learning based classification on a smartphone, we are able to provide early danger detection in real-time, from up to 60m away, and alert the user with low latency and high accuracy. To further reduce power consumption of the battery-powered wearable headset, we implement a custom-designed integrated circuit that is able to compute delays between multiple channels of audio with nW power consumption. A regression-based method for sound source localization, AvPR, is proposed and used in combination with the IC to improve the granularity and robustness of localization. 
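The multi-channel delay computation described in the abstract above is commonly performed with generalized cross-correlation. The sketch below shows the standard GCC-PHAT technique, as an assumed stand-in for (not a reproduction of) the custom IC's method; the function name and test signal are my own.

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs):
    """Estimate the delay (in seconds) of `sig` relative to `ref`
    using GCC-PHAT generalized cross-correlation."""
    n = len(sig) + len(ref)            # zero-pad to avoid wrap-around
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12     # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# Toy usage: recover a 5-sample delay between two channels.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref))[:1024]
delay = gcc_phat_delay(sig, ref, fs)
```

Given pairwise delays between microphones and the array geometry, the direction of arrival can then be estimated, e.g. by a regression method such as the AvPR approach the abstract mentions.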
  2. With the prevalence of smartphones, pedestrians and joggers today often walk or run while listening to music. Since they are deprived of the auditory senses that would have provided important cues to dangers, they are at a much greater risk of being hit by cars or other vehicles. In this paper, we build a wearable system that uses multi-channel audio sensors embedded in a headset to help detect and locate cars from their honks, engine and tire noises, and warn pedestrians of imminent dangers of approaching cars. We demonstrate that using a segmented architecture and implementation consisting of headset-mounted audio sensors, front-end hardware that performs signal processing and feature extraction, and machine learning based classification on a smartphone, we are able to provide early danger detection in real time, from up to 60 m away, with near-100% precision in vehicle detection, and alert the user with low latency. 
  3. Spatial sound reasoning is a fundamental human skill, enabling us to navigate and interpret our surroundings based on sound. In this paper we present BAT, which combines the spatial sound perception ability of a binaural acoustic scene analysis model with the natural language reasoning capabilities of a large language model (LLM) to replicate this innate ability. To address the lack of existing datasets of in-the-wild spatial sounds, we synthesized a binaural audio dataset using AudioSet and SoundSpaces 2.0. Next, we developed SpatialSoundQA, a spatial sound-based question-answering dataset, offering a range of QA tasks that train BAT in various aspects of spatial sound perception and reasoning. The acoustic front-end encoder of BAT is a novel spatial audio encoder named Spatial Audio Spectrogram Transformer, or Spatial-AST, which by itself achieves strong performance across sound event detection, spatial localization, and distance estimation. By integrating Spatial-AST with the LLaMA-2 7B model, BAT transcends standard Sound Event Localization and Detection (SELD) tasks, enabling the model to reason about the relationships between the sounds in its environment. Our experiments demonstrate BAT's superior performance on both spatial sound perception and reasoning, showcasing the immense potential of LLMs in navigating and interpreting complex spatial audio environments. 
  4. The joint analysis of audio and video is a powerful tool that can be applied to various contexts, including action, speech, and sound recognition, audio-visual video parsing, emotion recognition in affective computing, and self-supervised training of deep learning models. Solving these problems often involves tackling core audio-visual tasks, such as audio-visual source localization, audio-visual correspondence, and audio-visual source separation, which can be combined in various ways to achieve the desired results. This paper provides a review of the literature in this area, discussing the advancements, history, and datasets of audio-visual learning methods for various application domains. It also presents an overview of the reported performances on standard datasets and suggests promising directions for future research. 
  5. Combustion vehicle emissions contribute to poor air quality and release greenhouse gases into the atmosphere, and vehicle pollution has been associated with numerous adverse health effects. Roadways with extensive waiting and/or passenger drop-off, such as school and hospital drop-off zones, can have a high incidence and density of idling vehicles, producing micro-climates of increased vehicle pollution. Thus, detecting idling vehicles can help in monitoring and responding to unnecessary idling, and can be integrated into real-time or offline systems to address the resulting pollution. In this paper, we present a real-time, dynamic vehicle idling detection algorithm. The proposed method relies on a multisensor, audio-visual, machine-learning workflow to detect idling vehicles under three conditions: moving, static with the engine on, and static with the engine off. A visual vehicle motion detector is built in the first stage, and then a contrastive-learning-based latent space is trained for classifying static vehicle engine sound. We test our system in real time at a hospital drop-off point in Salt Lake City. This in situ dataset was collected and annotated, and it includes vehicles of varying models and types. The experiments show that the method can detect engine switching on or off instantly and achieves 71.02 average precision (AP) for idle detection and 91.06 for engine-off detection. 
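The contrastive-learning latent space mentioned in the abstract above is not detailed here, but one common way to classify on top of such an embedding space is by cosine similarity to per-class centroids. The sketch below is a rough illustration under that assumption; the function name, labels, and toy vectors are all hypothetical, not the paper's implementation.

```python
import numpy as np

def nearest_centroid_label(embedding, centroids):
    """Classify an audio embedding by cosine similarity to class
    centroids, as one might atop a contrastive latent space."""
    e = embedding / (np.linalg.norm(embedding) + 1e-12)
    best_label, best_sim = None, -np.inf
    for label, c in centroids.items():
        c = c / (np.linalg.norm(c) + 1e-12)
        sim = float(e @ c)               # cosine similarity
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label

# Toy usage with 3-D stand-in embeddings for the two engine states.
centroids = {
    "engine_on":  np.array([0.9, 0.1, 0.2]),
    "engine_off": np.array([0.1, 0.9, 0.3]),
}
```

In practice the centroids would be the mean embeddings of labeled training clips, and the contrastive training objective would pull same-class clips together in the latent space so that this simple rule separates engine-on from engine-off sound.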