Title: Improving the Robustness of DTW to Global Time Warping Conditions in Audio Synchronization
Dynamic time warping estimates the alignment between two sequences and is designed to handle a variable amount of time warping. In many contexts, it performs poorly when confronted with two sequences of different scale, in which the average slope of the true alignment path in the pairwise cost matrix deviates significantly from one. This paper investigates ways to improve the robustness of DTW to such global time warping conditions, using an audio–audio alignment task as a motivating scenario of interest. We modify a dataset commonly used for studying audio–audio synchronization in order to construct a benchmark in which the global time warping conditions are carefully controlled, and we evaluate the effectiveness of several strategies designed to handle global time warping. Among the strategies tested, there is a clear winner: performing sequence length normalization via downsampling before invoking DTW. This method achieves the best alignment accuracy across a wide range of global time warping conditions, and it maintains or reduces the runtime compared to standard usages of DTW. We present experiments and analyses to demonstrate its effectiveness in both controlled and realistic scenarios.
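The length-normalization idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the downsampling is done, so the block-averaging scheme, the Euclidean frame cost, the unweighted (1,1)/(1,0)/(0,1) step set, and all function names here are assumptions.

```python
import numpy as np


def downsample_to_length(features, target_len):
    """Block-average a (n_frames, n_dims) array down to target_len frames.

    Assumed scheme: frames are split into target_len contiguous blocks and
    each block is replaced by its mean. Sequences that are already short
    enough are returned unchanged (copied).
    """
    n = features.shape[0]
    if target_len >= n:
        return features.copy()
    edges = np.linspace(0, n, target_len + 1).astype(int)
    return np.stack([features[a:b].mean(axis=0)
                     for a, b in zip(edges[:-1], edges[1:])])


def dtw_path(x, y):
    """Plain DTW with unweighted steps; returns a list of (i, j) frame pairs."""
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # Backtrack from the top-right corner to the origin.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(acc[i - 1, j - 1], i - 1, j - 1),
                      (acc[i - 1, j], i - 1, j),
                      (acc[i, j - 1], i, j - 1)]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return path[::-1]


def align_with_length_normalization(x, y):
    """Downsample the longer sequence to the shorter length, then run DTW.

    The returned path is expressed in the original frame indices of x and y.
    """
    target = min(len(x), len(y))
    xd = downsample_to_length(x, target)
    yd = downsample_to_length(y, target)
    scale_x = len(x) / len(xd)
    scale_y = len(y) / len(yd)
    return [(int(i * scale_x), int(j * scale_y)) for i, j in dtw_path(xd, yd)]
```

After normalization both sequences have the same number of frames, so the expected alignment path has an average slope close to one, which is the regime the abstract describes DTW as handling well. The cost matrix is also no larger than it would be without downsampling.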
Award ID(s):
2144050
PAR ID:
10520547
Author(s) / Creator(s):
; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Applied Sciences
Volume:
14
Issue:
4
ISSN:
2076-3417
Page Range / eLocation ID:
1459
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Alignment algorithms like DTW and subsequence DTW assume specific boundary conditions on where an alignment path can begin and end in the cost matrix. In practice, the boundary conditions may not be known a priori or may not satisfy such strict assumptions. This paper introduces an alignment algorithm called FlexDTW that is designed to handle a wide range of boundary conditions. FlexDTW allows alignment paths to start anywhere on the bottom or left edge of the cost matrix (adjacent to the origin) and to end anywhere on the top or right edge. In order to properly compare paths of very different lengths, we use a goodness measure that normalizes the cumulative path cost by the path length. The key insight of FlexDTW is that the Manhattan length of a path can be computed by simply knowing the starting point of the path, which can be computed recursively during dynamic programming. We artificially generate a suite of 16 benchmarks based on the Chopin Mazurka dataset in order to characterize audio alignment performance under a variety of boundary conditions. We show that FlexDTW has consistently strong performance that is comparable to or better than commonly used alignment algorithms, and it is the only system with strong performance in some boundary conditions.
  2. DTW calculates the similarity or alignment between two signals, subject to temporal warping. However, its computational complexity grows exponentially with the number of time-series. Although there have been algorithms developed that are linear in the number of time-series, they are generally quadratic in time-series length. The exception is generalized time warping (GTW), which has linear computational cost. Yet, it can only identify simple time warping functions. There is a need for a new fast, high-quality multisequence alignment algorithm. We introduce trainable time warping (TTW), whose complexity is linear in both the number and the length of time-series. TTW performs alignment in the continuous-time domain using a sinc convolutional kernel and a gradient-based optimization technique. We compare TTW and GTW on 85 UCR datasets in time-series averaging and classification. TTW outperforms GTW on 67.1% of the datasets for the averaging tasks, and 61.2% of the datasets for the classification tasks.
  3. Dynamic Time Warping (DTW) is widely used as a similarity measure in various domains. Due to its invariance against warping in the time axis, DTW provides more meaningful discrepancy measurements between two signals than other distance measures. In this paper, we propose a novel component in an artificial neural network. In contrast to the previous successful usage of DTW as a loss function, the proposed framework leverages DTW to obtain a better feature extraction. For the first time, the DTW loss is theoretically analyzed, and a stochastic backpropagation scheme is proposed to improve the accuracy and efficiency of the DTW learning. We also demonstrate that the proposed framework can be used as a data analysis tool to perform data decomposition.
  4. This paper investigates an ordered partial matching alignment problem, in which the goal is to align two sequences in the presence of potentially non-matching regions. We propose a novel parameter-free dynamic programming alignment method called hidden state time warping that allows an alignment path to switch between two different planes: a “visible” plane corresponding to matching sections and a “hidden” plane corresponding to non-matching sections. By defining two distinct planes, we can allow different types of time warping in each plane (e.g., imposing a maximum warping factor in matching regions while allowing completely unconstrained movements in non-matching regions). The resulting algorithm can determine the optimal continuous alignment path via dynamic programming, and the visible plane induces a (possibly) discontinuous alignment path containing matching regions. We show that this approach outperforms existing parameter-free methods on two different partial matching alignment problems involving speech and music. 
  5. While it remains uncertain whether excursions in the stable carbon isotopic composition of Ediacaran marine carbonate (δ13Ccarb) represent globally synchronous events (or a direct measure of ocean carbon cycling), the absence of widely distributed and readily preservable fauna, and the presence of several iconic carbon isotope excursions (CIEs), has sustained δ13Ccarb correlation as the primary means to establish relative time relationships for Ediacaran successions. Here we present an Ediacaran global δ13Ccarb composite built with a dynamic time warping (DTW) time-normalization algorithm that generates libraries of least-squares alignments between chemostratigraphic records of unequal length and distinct sediment accumulation rates. When developing a δ13Ccarb composite for each of 16 globally distributed Ediacaran paleo-depositional regions, we selected high Pearson r alignments that conformed with published geological guidance about the correlation of constituent sections. When applying DTW to align these regional algorithmic composites into one global δ13Ccarb stack, we selected alignments that allied the excursions that field workers have established (or speculated) are the Marinoan cap carbonate excursion, the Shuram excursion, and/or the basal Cambrian excursion. There are strengths and weaknesses to making explicit the temporal relationships (point-to-point correspondences) often left implicit in visual correlation. One strength is to extrapolate depositional ages by means of isotopic correlation, and here we explored this with a Bayesian Markov chain Monte Carlo age model that predicts a median age, and uncertainty, for every carbonate stratum in the global Ediacaran δ13Ccarb composite. Yet, one must caution against a false accuracy that can arise from selecting one alignment among many possibilities: the likelihood that time-uncertain time series can be stretched and squeezed into one unequivocal alignment is low. Thus, while these alignments are grounded in the expert assessment of the field worker, this global Ediacaran δ13Ccarb–Bayesian age model should be viewed as a working hypothesis to enrich, but not arbitrate, discussions of the correlation, synchrony, and completeness of Ediacaran successions.
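The start-point bookkeeping mentioned in item 1 can be illustrated with a small sketch. It shows only the idea that each cell can inherit the starting coordinate of its best predecessor, so the Manhattan length of the path ending at a cell follows from the start and end points alone. It is not the FlexDTW algorithm: the unweighted step set, the free start on every bottom-row and left-column cell, the predecessor choice by raw cumulative cost, and the function names are all assumptions made for illustration.

```python
import numpy as np


def track_path_starts(cost):
    """Fill cumulative costs while propagating each path's starting cell.

    Returns (acc, start), where acc[i, j] is the cumulative cost of the best
    path ending at (i, j) and start[i, j] holds that path's (row, col) origin.
    """
    n, m = cost.shape
    acc = np.empty((n, m))
    start = np.empty((n, m, 2), dtype=int)
    for i in range(n):
        for j in range(m):
            if i == 0 or j == 0:
                # Assumed boundary rule: a path may begin at any edge cell.
                acc[i, j] = cost[i, j]
                start[i, j] = (i, j)
                continue
            predecessors = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
            best = min(predecessors, key=lambda ij: acc[ij])
            acc[i, j] = cost[i, j] + acc[best]
            start[i, j] = start[best]  # inherit the start point recursively
    return acc, start


def manhattan_length(start, i, j):
    """Manhattan length of the path ending at (i, j), from its start point."""
    start_i, start_j = start[i, j]
    return (i - start_i) + (j - start_j)
```

A normalized goodness measure of the kind the abstract describes could then divide a cumulative cost by this length when comparing candidate end points on the top or right edge; how FlexDTW weights steps and applies the normalization is not specified in the abstract and is left out here.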