

Title: THE SHAPE OF REMIXXXES TO COME: AUDIO TEXTURE SYNTHESIS WITH TIME-FREQUENCY SCATTERING
This article explains how to apply time–frequency scattering, a convolutional operator extracting modulations in the time–frequency domain at different rates and scales, to the re-synthesis and manipulation of audio textures. After implementing phase retrieval in the scattering network by gradient backpropagation, we introduce scale-rate DAFx, a class of audio transformations expressed in the domain of time–frequency scattering coefficients. One example of scale-rate DAFx is chirp rate inversion, which causes each sonic event to be locally reversed in time while leaving the arrow of time globally unchanged. Over the past two years, our work has led to the creation of four electroacoustic pieces: FAVN; Modulator (Scattering Transform); Experimental Palimpsest; Inspection (Maida Vale Project) and Inspection II; as well as XAllegroX (Hecker Scattering.m Sequence), a remix of Lorenzo Senni’s XAllegroX, released by Warp Records on a vinyl entitled The Shape of RemiXXXes to Come.
Award ID(s):
1633206
PAR ID:
10118932
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the 22nd International Conference on Digital Audio Effects (DAFx-19)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Brenner, Susan (Ed.)
    This paper proposes a frequency-time hybrid solver for the time-dependent wave equation in two-dimensional interior spatial domains. The approach relies on four main elements, namely, (1) A multiple scattering strategy that decomposes a given interior time-domain problem into a sequence of limited-duration time-domain problems of scattering by overlapping open arcs, each one of which is reduced (by means of the Fourier transform) to a sequence of Helmholtz frequency-domain problems; (2) Boundary integral equations on overlapping boundary patches for the solution of the frequency-domain problems in point (1); (3) A smooth “Time-windowing and recentering” methodology that enables both treatment of incident signals of long duration and long time simulation; and, (4) A Fourier transform algorithm that delivers numerically dispersionless, spectrally-accurate time evolution for given incident fields. By recasting the interior time-domain problem in terms of a sequence of open-arc multiple scattering events, the proposed approach regularizes the full interior frequency domain problem, which, if obtained by either Fourier or Laplace transformation of the corresponding interior time-domain problem, must encapsulate infinitely many scattering events, giving rise to non-uniqueness and eigenfunctions in the Fourier case, and ill conditioning in the Laplace case. Numerical examples are included which demonstrate the accuracy and efficiency of the proposed methodology.
  2. Audio-based human activity recognition (HAR) is very popular because many human activities have unique sound signatures that can be detected using machine learning (ML) approaches. These audio-based ML HAR pipelines often use common featurization techniques, such as extracting various statistical and spectral features by converting time domain signals to the frequency domain (using an FFT) and using them to train ML models. Some of these approaches also claim privacy benefits by preventing the identification of human speech. However, recent deep learning-based automatic speech recognition (ASR) models pose new privacy challenges to these featurization techniques. In this paper, we systematically evaluate various featurization approaches for audio data, assessing their privacy risks through metrics like speech intelligibility (PER and WER) while considering the utility tradeoff in terms of ML-based activity recognition accuracy. Our findings reveal the susceptibility of these approaches to speech content recovery when exposed to recent ASR models, especially under re-tuning or retraining conditions. Notably, fine-tuned ASR models achieved an average Phoneme Error Rate (PER) of 39.99% and Word Error Rate (WER) of 44.43% in speech recognition for these approaches. To overcome these privacy concerns, we propose Kirigami, a lightweight machine learning-based audio speech filter that removes human speech segments reducing the efficacy of ASR models (70.48% PER and 101.40% WER) while also maintaining HAR accuracy (76.0% accuracy). We show that Kirigami can be implemented on common edge microcontrollers with limited computational capabilities and memory, providing a path to deployment on a variety of IoT devices. Finally, we conducted a real-world user study and showed the robustness of Kirigami on a laptop and an ARM Cortex-M4F microcontroller under three different background noises. 
  3. We present an innovative approach to auto-annotate Expert Defined Linguistic Features (EDLFs) as subsequences in audio time series to improve audio deepfake discernment. In our prior work, these linguistic features – namely pitch, pause, breath, consonant release bursts, and overall audio quality, labeled by experts on the entire audio signal – have been shown to improve detection of audio deepfakes with AI algorithms. We now expand our approach to pilot a way to auto annotate subsequences in the time series that correspond to each EDLF. We developed an ensemble of discords, i.e. anomalies in time series, detected using matrix profiles across multiple discord lengths to identify multiple types of EDLFs. Working closely with linguistic experts, we evaluated where discords overlapped with EDLFs in the audio signal data. Our ensemble method to detect discords across multiple discord lengths achieves much higher accuracy than using individual discord lengths to detect EDLFs. With this approach and domain validation we establish the feasibility of using time series subsequences to capture EDLFs to supplement annotation by domain experts, for improved audio deepfake detection. 
  4. Magnetars are the most promising progenitors of fast radio bursts (FRBs). Strong radio waves propagating through the magnetar wind are subject to non-linear effects, including modulation/filamentation instabilities. We derive the dispersion relation for modulations of strong waves propagating in magnetically dominated pair plasmas focusing on dimensionless strength parameters $$a_0 \lesssim 1$$, and discuss implications for FRBs. As an effect of the instability, the FRB-radiation intensity develops sheets perpendicular to the direction of the wind magnetic field. When the FRB front expands outside the radius where the instability ends, the radiation sheets are scattered due to diffraction. The FRB-scattering time-scale depends on the properties of the magnetar wind. In a cold wind, the typical scattering time-scale is $$\tau _{\rm sc}\sim \mu {\rm s}$$–ms at the frequency $$\nu \sim 1\, {\rm GHz}$$. The scattering time-scale increases at low frequencies, with the scaling $$\tau _{\rm sc}\propto \nu ^{-2}$$. The frequency-dependent broadening of the brightest pulse of FRB 181112 is consistent with this scaling. From the scattering time-scale of the pulse, one can estimate that the wind Lorentz factor is larger than a few tens. In a warm wind, the scattering time-scale can approach $$\tau _{\rm sc}\sim \, {\rm ns}$$. Then scattering produces a frequency modulation of the observed intensity with a large bandwidth, $$\Delta \nu \sim 1/\tau _{\rm sc}\gtrsim 100\, {\rm MHz}$$. Broad-band frequency modulations observed in FRBs could be due to scattering in a warm magnetar wind.
  5. There is growing evidence that prey perceive the risk of predation and alter their behavior in response, resulting in changes in spatial distribution and potential fitness consequences. Previous approaches to mapping predation risk across a landscape quantify predator space use to estimate potential predator‐prey encounters, yet this approach does not account for successful predator attack resulting in prey mortality. An exception is a prey kill site that reflects an encounter resulting in mortality, but obtaining information on kill sites is expensive and requires time to accumulate adequate sample sizes. We illustrate an alternative approach using predator scat locations and their contents to quantify spatial predation risk for elk (Cervus canadensis) from multiple predators in the Rocky Mountains of Alberta, Canada. We surveyed over 1300 km to detect scats of bears (Ursus arctos/U. americanus), cougars (Puma concolor), coyotes (Canis latrans), and wolves (C. lupus). To derive spatial predation risk, we combined predictions of scat‐based resource selection functions (RSFs) weighted by predator abundance with predictions that a predator‐specific scat in a location contained elk. We evaluated the scat‐based predictions of predation risk by correlating them to predictions based on elk kill sites. We also compared scat‐based predation risk on summer ranges of elk following three migratory tactics for consistency with telemetry‐based metrics of predation risk and cause‐specific mortality of elk. We found a strong correlation between the scat‐based approach presented here and predation risk predicted by kill sites (r = .98, p < .001). Elk migrating east of the Ya Ha Tinda winter range were exposed to the highest predation risk from cougars, resident elk summering on the Ya Ha Tinda winter range were exposed to the highest predation risk from wolves and coyotes, and elk migrating west to summer in Banff National Park were exposed to highest risk of encountering bears, but it was less likely to find elk in bear scats than in other areas. These patterns were consistent with previous estimates of spatial risk based on telemetry of collared predators and recent cause‐specific mortality patterns in elk. A scat‐based approach can provide a cost‐efficient alternative to kill sites for quantifying broad‐scale, spatial patterns in risk of predation for prey, particularly in multiple predator species systems.