LiveGuard: Voice Liveness Detection via Wavelet Scattering Transform and Mel Spectrogram Scaling

Shan, Liqun; Zhang, Xingli; Hossen, Md Imran; Hei, Xiali

doi:10.1109/DSN64029.2025.00041

Citation Details

This content will become publicly available on June 23, 2026

LiveGuard: Voice Liveness Detection via Wavelet Scattering Transform and Mel Spectrogram Scaling

Voice-controlled interfaces are essential in modern smart devices, but they remain vulnerable to replay attacks that compromise voice authentication systems. Existing voice liveness detection methods often struggle to distinguish human speech from replayed audio. This paper introduces a novel approach, LiveGuard, utilizing wavelet scattering transform (WST) and Mel spectrogram scaling with a lightweight ResNet architecture to enhance voice liveness detection. WST captures robust hierarchical features, while Mel spectrogram scaling extracts fine-grained acoustic details, which the lightweight ResNet efficiently processes to identify live voice. Experimental results demonstrate accuracy improvements of 6% with WST and Mel spectrogram scaling, achieving a top accuracy of 97.17% on POCO dataset. Meanwhile, LiveGuard demonstrates superior performance on ASVspoof2019 and ASVspoof2021 benchmarks. It achieves the lowest equal error rate (EER) of 0.13%, and a min t-DCF of 0.00126 on ASVspoof2019, and an EER of 0.42% on ASVspoof2021, surpassing state-of-the-art methods. more »

Award ID(s):: 2231682 2117785 2229752 1946231

PAR ID:: 10647749

Author(s) / Creator(s):: Shan, Liqun ; Zhang, Xingli ; Hossen, Md Imran ; Hei, Xiali

Publisher / Repository:: IEEE

Date Published:: 2025-06-23

Page Range / eLocation ID:: 317 to 330

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on June 23, 2026
Conference Paper:
https://doi.org/10.1109/DSN64029.2025.00041

More Like this