Deep Audio Watermarks are Shallow: Limitations of Post-Hoc Watermarking Techniques for Speech

O'Reilly, P; Pardo, B; Jin, Z; Su, J

This content will become publicly available on April 26, 2026

In the audio modality, state-of-the-art watermarking methods leverage deep neural networks to allow the embedding of human-imperceptible signatures in generated audio. The ideal is to embed signatures that can be detected with high accuracy when the watermarked audio is altered via compression, filtering, or other transformations. Existing audio watermarking techniques operate in a post-hoc manner, manipulating “low-level” features of audio recordings after generation (e.g., through the addition of a low-magnitude watermark signal). We show that this post-hoc formulation makes existing audio watermarks vulnerable to transformation-based removal attacks. Focusing on speech audio, we (1) unify and extend existing evaluations of the effect of audio transformations on watermark detectability, and (2) demonstrate that state-of-the-art post-hoc audio watermarks can be removed with no knowledge of the watermarking scheme and minimal degradation in audio quality.

Award ID(s): 2222369

PAR ID: 10638270

Author(s) / Creator(s): O'Reilly, P; Pardo, B; Jin, Z; Su, J

Publisher / Repository: The 1st Workshop on GenAI Watermarking (WMARK), collocated with ICLR 2025

Date Published: 2025-04-26

Subject(s) / Keyword(s): watermarking; generative AI; audio; speech

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
