Data security plays a crucial role in all areas of data transmission, processing, and storage. This paper considers security in eavesdropping attacks over wireless communication links in aeronautical telemetry systems. Data streams in these systems are often encrypted by traditional encryption algorithms such as the Advanced Encryption Standard (AES). Here, we propose a secure coding technique for the integrated Network Enhanced Telemetry (iNET) communications system that can be coupled with modern encryption schemes. We consider a wiretap scenario where there are two telemetry links between a test article (TA) and a legitimate receiver, or ground station (GS). We show how these two links can be used to transmit both encrypted and unencrypted data streams while keeping both streams secure. A single eavesdropper is assumed who can tap into both links through its noisy channel. Since our scheme does not require encryption of the unencrypted data stream, the proposed scheme offers the ability to reduce the size of the required secret key while keeping the transmitted data secure.
more »
« less
Whisper: a unilateral defense against VoIP traffic re-identification attacks
Encrypted voice-over-IP (VoIP) communication often uses variable bit rate (VBR) codecs to achieve good audio quality while minimizing bandwidth costs. Prior work has shown that encrypted VBR-based VoIP streams are vulnerable to re-identification attacks in which an attacker can infer attributes (e.g., the language being spoken, the identities of the speakers, and key phrases) about the underlying audio by analyzing the distribution of packet sizes. Existing defenses require the participation of both the sender and receiver to secure their VoIP communications. This paper presents Whisper, the first unilateral defense against re-identification attacks on encrypted VoIP streams. Whisper works by modifying the audio signal before it is encoded by the VBR codec, adding inaudible audio that either falls outside the fixed range of human hearing or is within the human audible range but is nearly imperceptible due to its low amplitude. By carefully inserting such noise, Whisper modifies the audio stream's distribution of packet sizes, significantly decreasing the accuracy of re-identification attacks. Its use is imperceptible by the (human) receiver. Whisper can be instrumented as an audio driver and requires no changes to existing (potentially closed-source) VoIP software. Since it is a unilateral defense, it can be applied at will by a user to enhance the privacy of its voice communications. We demonstrate that Whisper significantly reduces the accuracy of re-identification attacks and incurs only a small degradation in audio quality.
more »
« less
- Award ID(s):
- 1718498
- PAR ID:
- 10178638
- Date Published:
- Journal Name:
- Proceedings of the Annual Computer Security Applications Conference (ACSAC)
- Page Range / eLocation ID:
- 286 to 296
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
In the audio modality, state-of-the-art watermarking methods leverage deep neural networks to allow the embedding of human-imperceptible signatures in generated audio. The ideal is to embed signatures that can be detected with highaccuracy when the watermarked audio is altered via compression, filtering, or other transformations. Existing audio watermarking techniques operate in a post-hoc manner, manipulating “low-level” features of audio recordings after generation (e.g. through the addition of a low-magnitude watermark signal). We show that this post-hoc formulation makes existing audio watermarks vulnerable to transformation-based removal attacks. Focusing on speech audio, we (1) unify and extend existing evaluations of the effect of audio transformations on watermark detectability, and (2) demonstrate that state-of-the-art post-hoc audio watermarks can be removed with no knowledge of the watermarking scheme and minimal degradation in audio qualitymore » « less
-
Automatic Speech Recognition (ASR) systems convert speech into text and can be placed into two broad categories: traditional and fully end-to-end. Both types have been shown to be vulnerable to adversarial audio examples that sound benign to the human ear but force the ASR to produce malicious transcriptions. Of these attacks, only the "psychoacoustic" attacks can create examples with relatively imperceptible perturbations, as they leverage the knowledge of the human auditory system. Unfortunately, existing psychoacoustic attacks can only be applied against traditional models, and are obsolete against the newer, fully end-to-end ASRs. In this paper, we propose an equalization-based psychoacoustic attack that can exploit both traditional and fully end-to-end ASRs. We successfully demonstrate our attack against real-world ASRs that include DeepSpeech and Wav2Letter. Moreover, we employ a user study to verify that our method creates low audible distortion. Specifically, 80 of the 100 participants voted in favor of all our attack audio samples as less noisier than the existing state-of-the-art attack. Through this, we demonstrate both types of existing ASR pipelines can be exploited with minimum degradation to attack audio quality.more » « less
-
Voice-controlled interfaces are essential in modern smart devices, but they remain vulnerable to replay attacks that compromise voice authentication systems. Existing voice liveness detection methods often struggle to distinguish human speech from replayed audio. This paper introduces a novel approach, LiveGuard, utilizing wavelet scattering transform (WST) and Mel spectrogram scaling with a lightweight ResNet architecture to enhance voice liveness detection. WST captures robust hierarchical features, while Mel spectrogram scaling extracts fine-grained acoustic details, which the lightweight ResNet efficiently processes to identify live voice. Experimental results demonstrate accuracy improvements of 6% with WST and Mel spectrogram scaling, achieving a top accuracy of 97.17% on POCO dataset. Meanwhile, LiveGuard demonstrates superior performance on ASVspoof2019 and ASVspoof2021 benchmarks. It achieves the lowest equal error rate (EER) of 0.13%, and a min t-DCF of 0.00126 on ASVspoof2019, and an EER of 0.42% on ASVspoof2021, surpassing state-of-the-art methods.more » « less
-
Adversarial machine learning research has recently demonstrated the feasibility to confuse automatic speech recognition (ASR) models by introducing acoustically imperceptible perturbations to audio samples. To help researchers and practitioners gain better understanding of the impact of such attacks, and to provide them with tools to help them more easily evaluate and craft strong defenses for their models, we present Adagio, the first tool designed to allow interactive experimentation with adversarial attacks and defenses on an ASR model in real time, both visually and aurally. Adagio incorporates AMR and MP3 audio compression techniques as defenses, which users can interactively apply to attacked audio samples. We show that these techniques, which are based on psychoacoustic principles, effectively eliminate targeted attacks, reducing the attack success rate from 92.5% to 0%. We will demonstrate Adagio and invite the audience to try it on the Mozilla Common Voice dataset. Code related to this paper is available at: https://github.com/nilakshdas/ADAGIO.more » « less
An official website of the United States government

