Title: Validating deep learning seabed classification via acoustic similarity
While seabed characterization methods have often focused on estimating individual sediment parameters, deep learning suggests a class-based approach that captures the overall acoustic effect of the seabed. A deep learning classifier, trained on 1D synthetic waveforms from underwater explosive sources, can distinguish 13 seabed classes that are distinct according to a proposed metric of acoustic similarity. When tested on seabeds not used in training, the classifier matches each unseen seabed to one of the top-3 most acoustically similar classes from the 13 training seabeds with 96% accuracy. This approach quantifies the performance of a seabed classifier in the face of real seabed variability.
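As a rough illustration of the setup described above, the following is a minimal sketch of a 1D convolutional waveform classifier for 13 classes together with the kind of top-3 matching score the abstract reports. The framework (PyTorch), architecture, layer sizes, and names are illustrative assumptions, not the paper's actual model.

```python
import torch
import torch.nn as nn

class SeabedCNN(nn.Module):
    """Illustrative 1D-CNN over raw time-series waveforms."""
    def __init__(self, n_classes: int = 13):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time to a fixed-size vector
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_samples) waveforms
        return self.head(self.features(x).squeeze(-1))

def top3_match_accuracy(pred, top3_similar):
    """Count a prediction correct if it falls among the 3 training classes
    most acoustically similar to the unseen test seabed (per the proposed
    similarity metric)."""
    hits = [int(p) in sim for p, sim in zip(pred, top3_similar)]
    return sum(hits) / len(hits)
```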
Award ID(s):
1757998
PAR ID:
10589279
Author(s) / Creator(s):
Publisher / Repository:
Acoustical Society of America (ASA)
Date Published:
Journal Name:
JASA Express Letters
Volume:
1
Issue:
4
ISSN:
2691-1191
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep learning is an important technique for extracting value from big data, but its effectiveness depends on large volumes of high-quality training data. In many cases, the training set is too small to train a deep learning classifier effectively. Data augmentation is a widely adopted approach for increasing the amount of training data, but the quality of the augmented data can be questionable, so a systematic evaluation of training data is critical. Furthermore, if the training data is noisy, the noise must be separated out automatically. In this paper, we propose a deep learning classifier for automatically separating good training data from noisy data. To train the classifier effectively, the original training data must be transformed to suit its input format. We also investigate different data augmentation approaches for generating a sufficient volume of training data from a limited-size original set. We evaluate the quality of the training data through cross-validation of classification accuracy with different classification algorithms, check the pattern of each data item, and compare the distributions of the datasets. We demonstrate the effectiveness of the proposed approach through an experimental investigation of automated classification of massive biomedical images. Our approach is generic and easily adaptable to other big data domains.
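A minimal sketch of the cross-validation quality check this abstract describes, assuming scikit-learn; the particular classifiers, their parameters, and the original-vs-augmented comparison are illustrative choices rather than the authors' exact setup.

```python
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

def dataset_quality_scores(X, y, cv=5):
    """Cross-validate several different algorithms on a candidate dataset
    and return the mean accuracy per algorithm as a data-quality signal."""
    models = {
        "random_forest": RandomForestClassifier(n_estimators=200),
        "svm": SVC(),
        "logreg": LogisticRegression(max_iter=1000),
    }
    return {name: cross_val_score(m, X, y, cv=cv).mean()
            for name, m in models.items()}

# Usage idea: compare the scores for the original data against an augmented
# version; a large accuracy drop on the augmented set suggests the
# augmentation introduced noisy or low-quality items.
```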
  2. The advent of deep learning algorithms for mobile devices and sensors has led to a dramatic expansion in the availability and number of systems trained on a wide range of machine learning tasks, creating a host of opportunities and challenges in the realm of transfer learning. Currently, most transfer learning methods require some kind of control over the systems learned, either by enforcing constraints during the source training or through a joint optimization objective between tasks that requires all data to be co-located for training. However, for practical, privacy, or other reasons, in a variety of applications we may have no control over the individual source task training, nor access to source training samples. Instead we only have access to features pre-trained on such data as the output of "black boxes." For such scenarios, we consider the multi-source learning problem of training a classifier using an ensemble of pre-trained neural networks for a set of classes that have not been observed by any of the source networks, and for which we have very few training samples. We show that by using these distributed networks as feature extractors, we can train an effective classifier in a computationally efficient manner using tools from (nonlinear) maximal correlation analysis. In particular, we develop a method we refer to as maximal correlation weighting (MCW) to build the required target classifier from an appropriate weighting of the feature functions from the source networks. We illustrate the effectiveness of the resulting classifier on datasets derived from CIFAR-100, Stanford Dogs, and Tiny ImageNet, and, in addition, use the methodology to characterize the relative value of different source tasks in learning a target task.
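The ensemble idea can be sketched very loosely as follows: extract features from each black-box network on the few-shot target data, weight each source by how well its features separate the new classes, and combine weighted class scores. The between-class-variance weighting and nearest-centroid scoring below are crude stand-ins for the paper's actual maximal correlation analysis, and every name is an assumption.

```python
import numpy as np

def fit_weighted_ensemble(feature_sets, y, n_classes):
    """feature_sets: list of (n_samples, d_k) arrays, one per source network;
    y: integer labels for the few-shot target samples."""
    params = []
    for F in feature_sets:
        F = (F - F.mean(0)) / (F.std(0) + 1e-8)  # standardize each feature
        centroids = np.stack([F[y == c].mean(0) for c in range(n_classes)])
        # Weight a source by how strongly its features separate the classes
        # (between-class variance of the centroids); a crude stand-in for
        # the maximal-correlation weights derived in the paper.
        weight = centroids.var(0).sum()
        params.append((centroids, weight))
    return params

def predict(feature_sets, params):
    scores = 0.0
    for F, (centroids, w) in zip(feature_sets, params):
        # (A real system would reuse the training-set statistics here.)
        F = (F - F.mean(0)) / (F.std(0) + 1e-8)
        scores = scores + w * (F @ centroids.T)  # weighted class scores
    return scores.argmax(1)
```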
  3. Although the application of deep learning to automatic speech recognition (ASR) has resulted in dramatic reductions in word error rate (WER) for languages with abundant training data, ASR for languages with few resources has yet to benefit from deep learning to the same extent. In this paper, we investigate various methods of acoustic modeling and data augmentation with the goal of improving the accuracy of a deep learning ASR framework for a low-resource language with a high baseline WER. We compare several methods of generating synthetic acoustic training data via voice transformation and signal distortion, and we explore several strategies for integrating this data into the acoustic training pipeline. We evaluate our methods on an indigenous language of North America with minimal training resources. We show that training initially via transfer learning from an existing high-resource language acoustic model, refining the weights using a heavily concentrated synthetic dataset, and finally fine-tuning on the target language with limited synthetic data reduces WER by 15% over transfer learning alone with deep recurrent methods. Further, a similar multistage training scheme with deep convolutional approaches improves over traditional frameworks by 19%.
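A minimal sketch of the signal-distortion style of augmentation mentioned above, using librosa; these particular transforms (speed, pitch, additive noise) and their ranges are common choices assumed here for illustration, not necessarily the paper's exact recipe.

```python
import numpy as np
import librosa

def augment(y, sr, rng=None):
    """Return distorted variants of waveform y (sample rate sr); each variant
    keeps the original utterance's transcript, multiplying the effective
    amount of acoustic training data."""
    if rng is None:
        rng = np.random.default_rng(0)
    variants = [y]
    # Speed perturbation (mild time stretch)
    variants.append(librosa.effects.time_stretch(y, rate=float(rng.uniform(0.9, 1.1))))
    # Pitch shift, a simple form of voice transformation
    variants.append(librosa.effects.pitch_shift(y, sr=sr, n_steps=float(rng.uniform(-2, 2))))
    # Mild additive noise
    noise = rng.normal(0.0, 0.005 * np.abs(y).max(), size=y.shape)
    variants.append(y + noise)
    return variants
```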
  4. Background: Joint acoustic emissions from knees have been evaluated as a convenient, non-invasive digital biomarker of inflammatory knee involvement in a small cohort of children with Juvenile Idiopathic Arthritis (JIA). The objective of the present study was to validate this in a larger cohort. Findings: A total of 116 subjects (86 with JIA and 30 healthy controls) participated in this study. Of the 86 subjects with JIA, 43 had active knee involvement at the time of the study. Joint acoustic emissions were recorded bilaterally, and the corresponding signal features were used to train a machine learning algorithm (XGBoost) to classify JIA and healthy knees. All active JIA knees and 80% of the controls were used as the training data set, while the remaining knees were used as the test data set. Leave-one-leg-out cross-validation was used for validation on the training data set. Validation on the training and test sets yielded accuracies of 81.1% and 87.7%, respectively, with sensitivity/specificity of 88.6%/72.3% and 88.1%/83.3%, respectively. The area under the receiver operating characteristic curve was 0.81 for the developed classifier. The distributions of the joint scores of the active and inactive knees were significantly different. Conclusion: Joint acoustic emissions can serve as an inexpensive and easy-to-use digital biomarker to distinguish JIA from healthy controls. Utilizing serial joint acoustic emission recordings can potentially help monitor disease activity in JIA-affected joints to enable timely changes in therapy.
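A minimal sketch of the validation scheme this abstract describes: an XGBoost classifier on per-knee acoustic features with leave-one-leg-out cross-validation, implemented here as leave-one-group-out with one group per leg. Feature extraction is out of scope, and the variable names and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from xgboost import XGBClassifier

def leave_one_leg_out_accuracy(X, y, leg_ids):
    """X: (n_recordings, n_features) acoustic-emission features;
    y: binary labels (JIA vs. healthy); leg_ids: one group id per leg,
    so each fold holds out all recordings from a single leg."""
    accs = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=leg_ids):
        model = XGBClassifier(n_estimators=200, max_depth=4,
                              eval_metric="logloss")
        model.fit(X[train_idx], y[train_idx])
        accs.append((model.predict(X[test_idx]) == y[test_idx]).mean())
    return float(np.mean(accs))
```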