
Title: Unsupervised Training of a DNN-Based Formant Tracker
Phonetic analysis often requires reliable estimation of formants, but estimates provided by popular programs can be unreliable. Recently, Dissen et al. [1] described DNN-based formant trackers that produced more accurate frequency estimates than several other methods, but that require manually corrected formant data for training. Here we describe a novel unsupervised training method for corpus-based DNN formant parameter estimation and tracking with accuracy similar to [1]. Frame-wise spectral envelopes serve as the input. The output is estimates of the frequencies and bandwidths, plus amplitude adjustments, for a prespecified number of poles and zeros, hereafter referred to as “formant parameters.” A custom loss measure based on the difference between the input envelope and one generated from the estimated formant parameters is calculated and back-propagated through the network to establish the gradients with respect to the formant parameters. The approach is similar to that of autoencoders, in that the model is trained to reproduce its input in order to discover latent features, in this case, the formant parameters. Our results demonstrate that a reliable formant tracker can be constructed for a speech corpus without the need for hand-corrected training data.
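As a rough illustration of the training idea, the sketch below (PyTorch; the names, shapes, and poles-only simplification are our assumptions, not the authors' code) builds a model log-magnitude envelope from estimated formant frequencies and bandwidths and scores it against the input envelope, so the mismatch can be back-propagated to the formant parameters:

```python
# Minimal sketch of the envelope-reconstruction loss, assuming poles only
# (the paper's model also includes zeros). Not the authors' implementation.
import torch

def envelope_from_formants(freqs, bws, gains, n_bins=257, fs=16000.0):
    """Log-magnitude envelope of a cascade of 2nd-order resonators.

    freqs, bws, gains: (batch, n_poles) formant frequencies (Hz),
    bandwidths (Hz), and per-pole gain adjustments (dB).
    """
    omega = torch.linspace(0.0, torch.pi, n_bins)      # frequency grid
    z = torch.exp(1j * omega)                          # points on the unit circle
    r = torch.exp(-torch.pi * bws / fs)                # pole radius from bandwidth
    theta = 2.0 * torch.pi * freqs / fs                # pole angle from frequency
    p = (r * torch.exp(1j * theta)).unsqueeze(-1)      # (batch, n_poles, 1)
    # |H(z)| of each conjugate pole pair, evaluated on the grid
    denom = (1 - p / z) * (1 - p.conj() / z)
    log_mag = -20.0 * torch.log10(denom.abs() + 1e-8)  # resonance in dB
    return (log_mag + gains.unsqueeze(-1)).sum(dim=1)  # cascade = sum in dB

def envelope_loss(pred_params, target_env):
    """Autoencoder-style loss: reconstruct the input envelope from the
    estimated formant parameters and penalize the mismatch."""
    freqs, bws, gains = pred_params
    model_env = envelope_from_formants(freqs, bws, gains,
                                       n_bins=target_env.shape[-1])
    return torch.mean((model_env - target_env) ** 2)
```

Because every step is differentiable, gradients of this loss flow back through the network that emitted the formant parameters, which is what lets the tracker train without hand-corrected labels.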
Award ID(s):
1816726
PAR ID:
10302319
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of InterSpeech 2021
Page Range / eLocation ID:
1189 to 1193
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    FinFET SRAM cells suffer from front-end wearout mechanisms, such as bias temperature instability and hot carrier injection. In this paper, we built a library based on deep neural networks (DNNs) to speed up the process of simulating FinFET SRAM cells' degradation. This library consists of two parts. The first part calculates circuit configuration parameters, wearout parameters, and the other input variables for the DNN. The second part calls the DNN to determine the shifted circuit performance metrics. A DNN with more than 99% accuracy is achieved with training data from standard Hspice simulations. The correctness of the DNN is also validated in the presence of input variations. With this library, the simulation speed is one hundred times faster than Hspice simulations. We can display the cell's degradation under various configurations easily and quickly. Also, the DNN-based library can help protect intellectual property without showing users the circuit's details.
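A minimal sketch of how such a two-part library could be organized (PyTorch; the feature list, network shape, and metric names are illustrative assumptions, not the paper's implementation):

```python
# Hypothetical two-part surrogate: part 1 assembles configuration and
# wearout inputs, part 2 queries a DNN instead of running Hspice.
import torch
import torch.nn as nn

class DegradationDNN(nn.Module):
    """Maps circuit/wearout features to shifted SRAM performance metrics."""
    def __init__(self, n_features=7, n_metrics=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_metrics),   # e.g., read delay, write delay, SNM, ...
        )
    def forward(self, x):
        return self.net(x)

def build_features(vdd, temp, stress_time, config):
    """Part 1: assemble DNN inputs. A placeholder for the BTI/HCI parameter
    calculations; a real library would evaluate device aging models here."""
    return torch.tensor([vdd, temp, stress_time, *config], dtype=torch.float32)

# Part 2: query the trained surrogate instead of simulating.
model = DegradationDNN(n_features=7, n_metrics=4)
x = build_features(0.7, 358.0, 1e8, config=[1, 2, 2, 1])  # fins per device, etc.
metrics = model(x)                                        # shifted metrics
```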
  2. This paper proposes an automatic parameter selection framework for optimizing the performance of parameter-dependent regularized reconstruction algorithms. The proposed approach exploits a convolutional neural network for direct estimation of the regularization parameters from the acquired imaging data. This method can provide very reliable parameter estimates in a computationally efficient way. The effectiveness of the proposed approach is verified on transform-learning-based magnetic resonance image reconstructions of two different publicly available datasets. The experiments qualitatively and quantitatively measure the improvement in image reconstruction quality from the proposed parameter selection strategy versus both existing parameter selection solutions and a fully deep-learning reconstruction with limited training data. Based on the experimental results, the proposed method improves average reconstructed image peak signal-to-noise ratio by 1 dB or more versus all competing methods on both the brain and knee datasets, over a range of subsampling factors and input noise levels.
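The core idea, a network that regresses a regularization parameter directly from the acquired image data, might look roughly like the following sketch (PyTorch; the architecture and the softplus output head are our assumptions, not the paper's):

```python
# Illustrative CNN regressor for a regularization weight; trained targets
# would be, e.g., the per-scan lambda that maximizes reconstruction PSNR.
import torch
import torch.nn as nn

class LambdaEstimator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global pooling -> fixed-size vector
        )
        self.head = nn.Linear(32, 1)

    def forward(self, img):
        h = self.features(img).flatten(1)
        # softplus keeps the predicted regularization weight positive
        return nn.functional.softplus(self.head(h))

est = LambdaEstimator()
lam = est(torch.randn(4, 1, 128, 128))        # one parameter per input image
```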
  3.
    Training Deep Neural Networks (DNNs) is resource-intensive and time-consuming. While prior research has explored many different ways of reducing DNN training time, the impact of the input data pipeline, i.e., fetching raw data items from storage and performing data pre-processing in memory, has been relatively unexplored. This paper makes the following contributions: (1) We present the first comprehensive analysis of how the input data pipeline affects the training time of widely used computer vision and audio DNNs, which typically involve complex data pre-processing. We analyze nine different models across three tasks and four datasets while varying factors such as the amount of memory, number of CPU threads, storage device, GPU generation, etc., on servers that are part of a large production cluster at Microsoft. We find that in many cases, DNN training time is dominated by data stall time: time spent waiting for data to be fetched and pre-processed. (2) We build a tool, DS-Analyzer, to precisely measure data stalls using a differential technique, and perform predictive what-if analysis on data stalls. (3) Finally, based on the insights from our analysis, we design and implement three simple but effective techniques in a data-loading library, CoorDL, to mitigate data stalls. Our experiments on a range of DNN tasks, models, datasets, and hardware configurations show that when PyTorch uses CoorDL instead of the state-of-the-art DALI data loading library, DNN training time is reduced significantly (by as much as 5X on a single server).
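The differential measurement idea can be sketched as follows: time an epoch with the real input pipeline, time it again with the same batches served from memory, and attribute the gap to data stalls. This is a conceptual illustration of the technique, not the DS-Analyzer tool itself:

```python
# Differential timing sketch: (full pipeline time) - (compute-only time)
# approximates time lost to fetching and pre-processing data.
import time
import torch

def epoch_time(model, loader, device="cuda"):
    """Wall-clock time for one pass over `loader`."""
    model.to(device)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model(x).sum().backward()   # stand-in for the real loss/optimizer step
        model.zero_grad()
    torch.cuda.synchronize()
    return time.perf_counter() - start

# Hypothetical usage: disk_loader reads and pre-processes from storage;
# cached_batches replays identical tensors already resident in memory.
# t_full     = epoch_time(model, disk_loader)     # fetch + pre-process + compute
# t_cached   = epoch_time(model, cached_batches)  # compute only
# data_stall = t_full - t_cached
```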
  4. When rheological models of polymer blends are used for inverse modeling, they can characterize polymer mixtures from rheological observations. This requires repeated evaluation of potentially expensive rheological models. We explored surrogate models based on Gaussian processes (GP-SM) as a cheaper alternative for describing the rheology of polydisperse binary blends. We used the time-dependent diffusion double reptation (TDD-DR) model as the true model; it takes as input a 5-dimensional vector specifying the binary blend and yields a function called the relaxation spectrum as output. We used the TDD-DR model to generate training data of different sizes n via Latin hypercube sampling. The optimal values of the GP-SM hyper-parameters, assuming a separable covariance kernel, were obtained by maximum likelihood estimation. The GP-SM interpolates the training data by design and offers reasonable predictions of relaxation spectra with uncertainty estimates. In general, the accuracy of GP-SMs improves as the size of the training data n increases, as does the cost of training and prediction. The optimal hyper-parameters were found to be relatively insensitive to n. Finally, we considered the inverse problem of inferring the structure of the polymer blend from a synthetic dataset generated using the true model. Surprisingly, the solutions to the inverse problem obtained using GP-SMs and the TDD-DR model were qualitatively similar. GP-SMs can be several orders of magnitude cheaper than expensive rheological models, which provides a proof-of-concept validation for using GP-SMs for inverse problems in polymer rheology.
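A conceptual version of this workflow, with scikit-learn standing in for the authors' GP implementation and a toy function standing in for the TDD-DR model, might look like:

```python
# Sketch: Latin hypercube design, GP fit with MLE hyper-parameters, and
# prediction with uncertainty. The 5-D blend parameterization and the
# spectrum discretization below are assumptions for illustration.
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def true_model(x):
    """Placeholder for the expensive TDD-DR evaluation: maps a 5-D blend
    descriptor to a discretized relaxation spectrum (50 points here)."""
    t = np.linspace(-2.0, 2.0, 50)
    return np.exp(-(t - x[:, :1]) ** 2 / (1.0 + x[:, 1:2] ** 2))

sampler = qmc.LatinHypercube(d=5, seed=0)
X = sampler.random(n=64)            # n training points in [0, 1]^5
Y = true_model(X)

# Anisotropic RBF kernel; fit() selects hyper-parameters by maximizing
# the marginal likelihood (the MLE step described above).
kernel = ConstantKernel() * RBF(length_scale=np.ones(5))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, Y)

x_new = sampler.random(n=1)
mean, std = gp.predict(x_new, return_std=True)   # spectrum + uncertainty
```

By construction the GP interpolates the training data, and each prediction comes with a standard deviation, which is the uncertainty estimate the abstract refers to.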
  5. A deep neural network (DNN)-based adaptive controller with a real-time and concurrent learning (CL)-based adaptive update law is developed for a class of uncertain, nonlinear dynamic systems. The DNN in the control law is used to approximate the uncertain nonlinear dynamic model. The inner-layer weights of the DNN are updated offline using data collected in real-time, whereas the output-layer DNN weights are updated online (i.e., in real-time) using the Lyapunov- and CL-based adaptation law. Specifically, the inner-layer weights of the DNN are trained offline (concurrent to real-time execution) after a sufficient amount of data is collected in real-time to improve the performance of the system, and after training is completed the inner-layer DNN weights are applied in batch updates. The key development in this work is that the output-layer DNN update law is augmented with CL-based terms to ensure that the output-layer DNN weight estimates converge to within a ball of their optimal values. A Lyapunov-based stability analysis is performed to ensure semi-global exponential convergence to an ultimate bound for the trajectory tracking errors and the output-layer DNN weight estimation errors.
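Schematically, a CL-augmented output-layer update combines a tracking-error-driven term with terms built from recorded data. The sketch below (NumPy; the gains, dimensions, and update form are assumptions illustrating the structure, not the paper's exact law):

```python
# Schematic concurrent-learning update for output-layer weights W_hat.
import numpy as np

def output_layer_update(W_hat, phi, e, history, gamma=1.0, k_cl=0.1):
    """One step of an adaptive update for output-layer weights.

    W_hat:   (n_phi, n_out) current output-layer weight estimate
    phi:     (n_phi,) inner-layer DNN features at the current state
    e:       (n_out,) trajectory tracking error
    history: recorded (phi_j, y_j) pairs for concurrent learning
    """
    # Tracking-error-driven term (Lyapunov-based adaptation)
    W_dot = gamma * np.outer(phi, e)
    # Concurrent-learning terms: pull W_hat toward explaining stored data,
    # driving the weight estimates toward a neighborhood of their ideal values
    for phi_j, y_j in history:
        W_dot += k_cl * np.outer(phi_j, y_j - W_hat.T @ phi_j)
    return W_dot
```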