skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: An audio‐based risky flight detection framework for quadrotors
Abstract Drones have increasingly collaborated with human workers in some workspaces, such as warehouses. The failure of a drone flight may bring potential risks to human beings' life safety during some aerial tasks. One of the most common flight failures is triggered by damaged propellers. To quickly detect physical damage to propellers, recognise risky flights, and provide early warnings to surrounding human workers, a new and comprehensive fault diagnosis framework is presented that uses only the audio caused by propeller rotation without accessing any flight data. The diagnosis framework includes three components: leverage convolutional neural networks, transfer learning, and Bayesian optimisation. Particularly, the audio signal from an actual flight is collected and transferred into time–frequency spectrograms. First, a convolutional neural network‐based diagnosis model that utilises these spectrograms is developed to identify whether there is any broken propeller involved in a specific drone flight. Additionally, the authors employ Monte Carlo dropout sampling to obtain the inconsistency of diagnostic results and compute the mean probability score vector's entropy (uncertainty) as another factor to diagnose the drone flight. Next, to reduce data dependence on different drone types, the convolutional neural network‐based diagnosis model is further augmented by transfer learning. That is, the knowledge of a well‐trained diagnosis model is refined by using a small set of data from a different drone. The modified diagnosis model has the ability to detect the broken propeller of the second drone. Thirdly, to reduce the hyperparameters' tuning efforts and reinforce the robustness of the network, Bayesian optimisation takes advantage of the observed diagnosis model performances to construct a Gaussian process model that allows the acquisition function to choose the optimal network hyperparameters. The proposed diagnosis framework is validated via real experimental flight tests and has a reasonably high diagnosis accuracy.  more » « less
Award ID(s):
2046481
PAR ID:
10485580
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  
Publisher / Repository:
DOI PREFIX: 10.1049
Date Published:
Journal Name:
IET Cyber-Systems and Robotics
Volume:
6
Issue:
1
ISSN:
2097-3608
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The rapid rise of accessibility of unmanned aerial vehicles or drones pose a threat to general security and confidentiality. Most of the commercially available or custom-built drones are multi-rotors and are comprised of multiple propellers. Since these propellers rotate at a high-speed, they are generally the fastest moving parts of an image and cannot be directly "seen" by a classical camera without severe motion blur. We utilize a class of sensors that are particularly suitable for such scenarios called event cameras, which have a high temporal resolution, low-latency, and high dynamic range. In this paper, we model the geometry of a propeller and use it to generate simulated events which are used to train a deep neural network called EVPropNet to detect propellers from the data of an event camera. EVPropNet directly transfers to the real world without any fine-tuning or retraining. We present two applications of our network: (a) tracking and following an unmarked drone and (b) landing on a near-hover drone. We successfully evaluate and demonstrate the proposed approach in many real-world experiments with different propeller shapes and sizes. Our network can detect propellers at a rate of 85.1% even when 60% of the propeller is occluded and can run at upto 35Hz on a 2W power budget. To our knowledge, this is the first deep learning-based solution for detecting propellers (to detect drones). Finally, our applications also show an impressive success rate of 92% and 90% for the tracking and landing tasks respectively. 
    more » « less
  2. Lung or heart sound classification is challenging due to the complex nature of audio data, its dynamic properties of time, and frequency domains. It is also very difficult to detect lung or heart conditions with small amounts of data or unbalanced and high noise in data. Furthermore, the quality of data is a considerable pitfall for improving the performance of deep learning. In this paper, we propose a novel feature-based fusion network called FDC-FS for classifying heart and lung sounds. The FDC-FS framework aims to effectively transfer learning from three different deep neural network models built from audio datasets. The innovation of the proposed transfer learning relies on the transformation from audio data to image vectors and from three specific models to one fused model that would be more suitable for deep learning. We used two publicly available datasets for this study, i.e., lung sound data from ICHBI 2017 challenge and heart challenge data. We applied data augmentation techniques, such as noise distortion, pitch shift, and time stretching, dealing with some data issues in these datasets. Importantly, we extracted three unique features from the audio samples, i.e., Spectrogram, MFCC, and Chromagram. Finally, we built a fusion of three optimal convolutional neural network models by feeding the image feature vectors transformed from audio features. We confirmed the superiority of the proposed fusion model compared to the state-of-the-art works. The highest accuracy we achieved with FDC-FS is 99.1% with Spectrogram-based lung sound classification while 97% for Spectrogram and Chromagram based heart sound classification. 
    more » « less
  3. This article addresses the automatic detection of vocal, nocturnally migrating birds from a network of acoustic sensors. Thus far, owing to the lack of annotated continuous recordings, existing methods had been benchmarked in a binary classification setting (presence vs. absence). Instead, with the aim of comparing them in event detection, we release BirdVox-full-night, a dataset of 62 hours of audio comprising 35402 flight calls of nocturnally migrating birds, as recorded from 6 sensors. We find a large performance gap between energybased detection functions and data-driven machine listening. The best model is a deep convolutional neural network trained with data augmentation. We correlate recall with the density of flight calls over time and frequency and identify the main causes of false alarm 
    more » « less
  4. Pavement surveying and distress mapping is completed by roadway authorities to quantify the topical and structural damage levels for strategic preventative or rehabilitative action. The failure to time the preventative or rehabilitative action and control distress propagation can lead to severe structural and financial loss of the asset requiring complete reconstruction. Continuous and computer-aided surveying measures not only can eliminate human error when analyzing, identifying, defining, and mapping pavement surface distresses, but also can provide a database of road damage patterns and their locations. The database can be used for timely road repairs to gain the maximum durability of the asphalt and the minimum cost of maintenance. This paper introduces an autonomous surveying scheme to collect, analyze, and map the image-based distress data in real time. A descriptive approach is considered for identifying cracks from collected images using a convolutional neural network (CNN) that classifies several types of cracks. Typically, CNN-based schemes require a relatively large processing power to detect desired objects in images in real time. However, the portability objective of this work requires to utilize low-weight processing units. To that end, the CNN training was optimized by the Bayesian optimization algorithm (BOA) to achieve the maximum accuracy and minimum processing time with minimum neural network layers. First, a database consisting of a diverse population of crack distress types such as longitudinal, transverse, and alligator cracks, photographed at multiple angles, was prepared. Then, the database was used to train a CNN whose hyperparameters were optimized using BOA. Finally, a heuristic algorithm is introduced to process the CNN’s output and produce the crack map. The performance of the classifier and mapping algorithm is examined against still images and videos captured by a drone from cracked pavement. In both instances, the proposed CNN was able to classify the cracks with 97% accuracy. The mapping algorithm is able to map a diverse population of surface cracks patterns in real time at the speed of 11.1 km per hour. 
    more » « less
  5. The automatic classification of electrocardiogram (ECG) signals has played an important role in cardiovascular diseases diagnosis and prediction. Deep neural networks (DNNs), particularly Convolutional Neural Networks (CNNs), have excelled in a variety of intelligent tasks including biomedical and health informatics. Most the existing approaches either partition the ECG time series into a set of segments and apply 1D-CNNs or divide the ECG signal into a set of spectrogram images and apply 2D-CNNs. These studies, however, suffer from the limitation that temporal dependencies between 1D segments or 2D spectrograms are not considered during network construction. Furthermore, meta-data including gender and age has not been well studied in these researches. To address those limitations, we propose a multi-module Recurrent Convolutional Neural Networks (RCNNs) consisting of both CNNs to learn spatial representation and Recurrent Neural Networks (RNNs) to model the temporal relationship. Our multi-module RCNNs architecture is designed as an end-to-end deep framework with four modules: (i) timeseries module by 1D RCNNs which extracts spatio-temporal information of ECG time series; (ii) spectrogram module by 2D RCNNs which learns visual-temporal representation of ECG spectrogram ; (iii) metadata module which vectorizes age and gender information; (iv) fusion module which semantically fuses the information from three above modules by a transformer encoder. Ten-fold cross validation was used to evaluate the approach on the MIT-BIH arrhythmia database (MIT-BIH) under different network configurations. The experimental results have proved that our proposed multi-module RCNNs with transformer encoder achieves the state-of-the-art with 99.14% F1 score and 98.29% accuracy. 
    more » « less