Title: Feature-Based Fusion Using CNN for Lung and Heart Sound Classification
Lung and heart sound classification is challenging because audio data are complex and vary dynamically in both the time and frequency domains. Detecting lung or heart conditions is also difficult when data are scarce, imbalanced, or noisy. Furthermore, data quality is a considerable obstacle to improving the performance of deep learning. In this paper, we propose a novel feature-based fusion network called FDC-FS for classifying heart and lung sounds. The FDC-FS framework aims to transfer learning effectively from three different deep neural network models built from audio datasets. The innovation of the proposed transfer learning lies in the transformation from audio data to image vectors and from three specific models to one fused model that is more suitable for deep learning. We used two publicly available datasets for this study: lung sound data from the ICBHI 2017 challenge and heart challenge data. We applied data augmentation techniques, such as noise distortion, pitch shift, and time stretching, to address data issues in these datasets. Importantly, we extracted three distinct features from the audio samples: Spectrogram, MFCC, and Chromagram. Finally, we built a fusion of three optimal convolutional neural network models by feeding them the image feature vectors transformed from the audio features. We confirmed the superiority of the proposed fusion model over state-of-the-art works. The highest accuracy we achieved with FDC-FS is 99.1% for Spectrogram-based lung sound classification and 97% for Spectrogram- and Chromagram-based heart sound classification.
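The abstract does not give implementation details, so the following is only a minimal sketch of two of the steps it names: noise-based augmentation and the conversion of an audio signal into a spectrogram "image". It uses NumPy alone rather than whatever audio library the authors used, and every name and parameter here (`add_noise`, `log_spectrogram`, the 4 kHz sampling rate, the 20 dB SNR, the frame length and hop size) is an assumption chosen for illustration, not a value from the paper.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at a target signal-to-noise ratio (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT.

    Returns an array of shape (frame_len // 2 + 1, n_frames) that can be
    treated as a single-channel image by a CNN.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T


# Hypothetical input: one second of a 200 Hz tone sampled at 4 kHz
# (a stand-in for a real lung or heart recording).
sr = 4000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 200 * t)

rng = np.random.default_rng(0)
noisy = add_noise(tone, snr_db=20, rng=rng)  # augmented copy of the signal
spec = log_spectrogram(noisy)                # image-like feature array

# Frequency bin with the highest average energy across frames.
peak_bin = int(np.argmax(spec.mean(axis=1)))
```

In a pipeline like the one described, an MFCC array and a chromagram would be computed from the same audio in a comparable way, and each image-like array would go to its own CNN before the learned features are fused. Pitch shifting and time stretching are omitted here because they need a resampling or phase-vocoder step that is beyond a short sketch.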
Award ID(s):
1935076 1951971 1747751
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Messinger, David W. ; Velez-Reyes, Miguel (Ed.)
    Recently, multispectral and hyperspectral data fusion models based on deep learning have been proposed to generate images with high spatial and spectral resolution. The general objective is to obtain images that improve spatial resolution while preserving high spectral content. In this work, two deep learning data fusion techniques are characterized in terms of classification accuracy. These methods fuse a high spatial resolution multispectral image with a lower spatial resolution hyperspectral image to generate a high spatial-spectral hyperspectral image. The first model is based on a multi-scale long short-term memory (LSTM) network. The LSTM approach performs the fusion using a multi-step process that transitions from low to high spatial resolution through an intermediate step capable of reducing spatial information loss while preserving spectral content. The second fusion model is based on a convolutional neural network (CNN) data fusion approach. We present fused images using four multi-source datasets with different spatial and spectral resolutions. Both models provide fused images with spatial resolution improved from 8 m to 1 m. The fused images obtained with the two models are evaluated in terms of classification accuracy on several classifiers: Minimum Distance, Support Vector Machines, Class-Dependent Sparse Representation, and CNN classification. The classification results show better performance in both overall and average accuracy for the images generated with the multi-scale LSTM fusion than for those generated with the CNN fusion.
  2. Deep Learning (DL) has significantly changed a large number of research areas in recent decades. For example, researchers have built several remarkable Convolutional Neural Network (CNN) models that successfully meet image classification needs using large-scale visual datasets. Transfer Learning (TL) makes use of those pre-trained models to ease the feature learning process for other target domains that contain a smaller amount of training data. Currently, there are numerous ways to utilize features generated by transfer learning. Pre-trained CNN models provide mid- and high-level features that can serve different target problem domains. In this paper, a DL feature and model selection framework based on evolutionary programming is proposed to solve the challenges in visual data classification. It automates the process of discovering and obtaining the most representative features generated by the pre-trained DL models for different classification tasks.
  3. Discovering word-like units without textual transcriptions is an important step in low-resource speech technology. In this work, we demonstrate a model inspired by statistical machine translation and hidden Markov model/deep neural network (HMM-DNN) hybrid systems. Our learning algorithm is capable of discovering the visual and acoustic correlates of distinct words in an unknown language by simultaneously learning the mapping from image regions to concepts (the first DNN), the mapping from acoustic feature vectors to phones (the second DNN), and the optimum alignment between the two (the HMM). In the simulated low-resource setting using MSCOCO and Speech-COCO datasets, our model achieves 62.4% alignment accuracy and outperforms the audio-only segmental embedded GMM approach on standard word discovery evaluation metrics.
  4. We introduce an active, semi-supervised algorithm that utilizes Bayesian experimental design to address the shortage of annotated images required to train and validate Artificial Intelligence (AI) models for lung cancer screening with computed tomography (CT) scans. Our approach incorporates active learning with semi-supervised expectation maximization to emulate the human in the loop for additional ground truth labels to train, evaluate, and update the neural network models. Bayesian experimental design is used to intelligently identify which unlabeled samples need ground truth labels to enhance the model’s performance. We evaluate the proposed Active Semi-supervised Expectation Maximization for Computer aided diagnosis (CAD) tasks (ASEM-CAD) on lung cancer classification using three public CT scan datasets: the National Lung Screening Trial (NLST), the Lung Image Database Consortium (LIDC), and Kaggle Data Science Bowl 2017. ASEM-CAD can accurately classify suspicious lung nodules and lung cancer cases with an area under the curve (AUC) of 0.94 (Kaggle), 0.95 (NLST), and 0.88 (LIDC) with significantly fewer labeled images compared to a fully supervised model. This study addresses one of the significant challenges in early lung cancer screening using low-dose computed tomography (LDCT) scans and is a valuable contribution towards the development and validation of deep learning algorithms for lung cancer screening and other diagnostic radiology examinations.
  5. Doulamis, Anastasios D. (Ed.)
    Hyperspectral imaging is an area of active research with many applications in remote sensing, mineral exploration, and environmental monitoring. Deep learning and, in particular, convolution-based approaches are the current state-of-the-art classification models. However, in the presence of noisy hyperspectral datasets, these deep convolutional neural networks underperform. In this paper, we propose a feature augmentation approach to increase noise resistance in imbalanced hyperspectral classification. Our method calculates context-based features and uses a deep convolutional neuronet (DCN). We tested our proposed approach on the Pavia datasets and compared three models, DCN, PCA + DCN, and our context-based DCN, using the original datasets and the datasets with added noise. Our experimental results show that DCN and PCA + DCN perform well on the original datasets but not on the noisy datasets. Our robust context-based DCN outperformed the others in the presence of noise and maintained comparable classification accuracy on clean hyperspectral images.