skip to main content

Title: Pitchln: eavesdropping via intelligible speech reconstruction using non-acoustic sensor fusion
Despite the advent of numerous Internet-of-Things (IoT) applications, recent research demonstrates potential side-channel vulnerabilities exploiting sensors which are used for event and environment monitoring. In this paper, we propose a new side-channel attack, where a network of distributed non-acoustic sensors can be exploited by an attacker to launch an eavesdropping attack by reconstructing intelligible speech signals. Specifically, we present PitchIn to demonstrate the feasibility of speech reconstruction from non-acoustic sensor data collected offline across networked devices. Unlike speech reconstruction which requires a high sampling frequency (e.g., > 5 KHz), typical applications using non-acoustic sensors do not rely on richly sampled data, presenting a challenge to the speech reconstruction attack. Hence, PitchIn leverages a distributed form of Time Interleaved Analog-Digital-Conversion (TIADC) to approximate a high sampling frequency, while maintaining low per-node sampling frequency. We demonstrate how distributed TI-ADC can be used to achieve intelligibility by processing an interleaved signal composed of different sensors across networked devices. We implement PitchIn and evaluate reconstructed speech signal intelligibility via user studies. PitchIn has word recognition accuracy as high as 79%. Though some additional work is required to improve accuracy, our results suggest that eavesdropping using a fusion of non-acoustic sensors is a real and more » practical threat. « less
Authors:
; ;
Award ID(s):
1645759
Publication Date:
NSF-PAR ID:
10048437
Journal Name:
Proc. 16th ACM/IEEE Int'l Conference on Information Processing in Sensor Networks (IPSN'17)
Page Range or eLocation-ID:
181 - 192
Sponsoring Org:
National Science Foundation
More Like this
  1. Neural network applications have become popular in both enterprise and personal settings. Network solutions are tuned meticulously for each task, and designs that can robustly resolve queries end up in high demand. As the commercial value of accurate and performant machine learning models increases, so too does the demand to protect neural architectures as confidential investments. We explore the vulnerability of neural networks deployed as black boxes across accelerated hardware through electromagnetic side channels. We examine the magnetic flux emanating from a graphics processing unit’s power cable, as acquired by a cheap $3 induction sensor, and find that this signal betrays the detailed topology and hyperparameters of a black-box neural network model. The attack acquires the magnetic signal for one query with unknown input values, but known input dimensions. The network reconstruction is possible due to the modular layer sequence in which deep neural networks are evaluated. We find that each layer component’s evaluation produces an identifiable magnetic signal signature, from which layer topology, width, function type, and sequence order can be inferred using a suitably trained classifier and a joint consistency optimization based on integer programming. We study the extent to which network specifications can be recovered, and considermore »metrics for comparing network similarity. We demonstrate the potential accuracy of this side channel attack in recovering the details for a broad range of network architectures, including random designs. We consider applications that may exploit this novel side channel exposure, such as adversarial transfer attacks. In response, we discuss countermeasures to protect against our method and other similar snooping techniques.« less
  2. In this paper, a monotonic power side-channel attack (PSA) is proposed to analyze the security vulnerabilities of flash analog-to-digital converters (ADC), where the digital output of a flash ADC is determined by characterizing the monotonic relationship between the traces of the power consumed and the applied input signals. A novel technique that leverages clock phase division is proposed to secure the power side channel information of a 4-bit flash ADC. The proposed technique adds randomness to decorrelate the input signal from the given power trace as the execution phase of each comparator depends on a thermometer code computed from the previous seven clock cycles. The monotonic PSA is executed on both a secured and unsecured ADC, with results indicating 1.9 bits of information leakage from an unprotected ADC and no data leakage from a protected ADC as the bit-wise accuracy is approximately 50% when secured. The monotonic PSA is more effective at attacking a flash ADC architecture than either a convolutional neural network based PSA or a correlation template PSA. The secured ADC core occupies approximately 2% more area than a non-secure ADC in a 65 nm process, and provides a sampling frequency of up to 500 MHz at amore »supply voltage of 1.2 V. Index Terms—power side-channel, ADC,« less
  3. This article details the implementation of a novel passive structural health monitoring approach for damage detection in wind turbine blades using airborne sound. The approach utilizes blade-internal microphones to detect trends, shifts, or spikes in the sound pressure level of the blade cavity using a limited network of internally distributed airborne acoustic sensors, naturally occurring passive system excitation, and periodic measurement windows. A test campaign was performed on a utility-scale wind turbine blade undergoing fatigue testing to demonstrate the ability of the method for structural health monitoring applications. The preliminary audio signal processing steps used in the study, which were heavily influenced by those methods commonly utilized in speech-processing applications, are discussed in detail. Principal component analysis and K-means clustering are applied to the feature-space representation of the data set to identify any outliers (synonymous with deviations from the normal operation of the wind turbine blade) in the measurements. The performance of the system is evaluated based on its ability to detect those structural events in the blade that are identified by making manual observations of the measurements. The signal processing methods proposed within the article are shown to be successful in detecting structural and acoustic aberrations experienced by amore »full-scale wind turbine blade undergoing fatigue testing. Following the assessment of the data, recommendations are given to address the future development of the approach in terms of physical limitations, signal processing techniques, and machine learning options.« less
  4. Objectively differentiating patient mental states based on electrical activity, as opposed to overt behavior, is a fundamental neuroscience problem with medical applications, such as identifying patients in locked-in state vs. coma. Electroencephalography (EEG), which detects millisecond-level changes in brain activity across a range of frequencies, allows for assessment of external stimulus processing by the brain in a non-invasive manner. We applied machine learning methods to 26-channel EEG data of 24 fluent Deaf signers watching videos of sign language sentences (comprehension condition), and the same videos reversed in time (non-comprehension condition), to objectively separate vision-based high-level cognition states. While spectrotemporal parameters of the stimuli were identical in comprehension vs. non-comprehension conditions, the neural responses of participants varied based on their ability to linguistically decode visual data. We aimed to determine which subset of parameters (specific scalp regions or frequency ranges) would be necessary and sufficient for high classification accuracy of comprehension state. Optical flow, characterizing distribution of velocities of objects in an image, was calculated for each pixel of stimulus videos using MATLAB Vision toolbox. Coherence between optical flow in the stimulus and EEG neural response (per video, per participant) was then computed using canonical component analysis with NoiseTools toolbox. Peakmore »correlations were extracted for each frequency for each electrode, participant, and video. A set of standard ML algorithms were applied to the entire dataset (26 channels, frequencies from .2 Hz to 12.4 Hz, binned in 1 Hz increments), with consistent out-of-sample 100% accuracy for frequencies in .2-1 Hz range for all regions, and above 80% accuracy for frequencies < 4 Hz. Sparse Optimal Scoring (SOS) was then applied to the EEG data to reduce the dimensionality of the features and improve model interpretability. SOS with elastic-net penalty resulted in out-of-sample classification accuracy of 98.89%. The sparsity pattern in the model indicated that frequencies between 0.2–4 Hz were primarily used in the classification, suggesting that underlying data may be group sparse. Further, SOS with group lasso penalty was applied to regional subsets of electrodes (anterior, posterior, left, right). All trials achieved greater than 97% out-of-sample classification accuracy. The sparsity patterns from the trials using 1 Hz bins over individual regions consistently indicated frequencies between 0.2–1 Hz were primarily used in the classification, with anterior and left regions performing the best with 98.89% and 99.17% classification accuracy, respectively. While the sparsity pattern may not be the unique optimal model for a given trial, the high classification accuracy indicates that these models have accurately identified common neural responses to visual linguistic stimuli. Cortical tracking of spectro-temporal change in the visual signal of sign language appears to rely on lower frequencies proportional to the N400/P600 time-domain evoked response potentials, indicating that visual language comprehension is grounded in predictive processing mechanisms.« less
  5. Existing analog-signal side-channels, such as EM emanations, are a consequence of current-flow changes that are dependent on activity inside an electronic circuits. In this paper, we introduce a new class of side-channels that is a consequence of impedance changes in switching circuits, and we refer to it as an impedance-based side-channel. One example of such a side-channel is when digital logic activity causes incoming EM signals to be modulated as they are reflected (backscattered), at frequencies that depend on both the incoming EM signal and the circuit activity. This can cause EM interference or leakage of sensitive information, but it can also be leveraged for RFID tag design. In this paper, we first introduce a new class of side-channels that is a consequence of impedance differences in switching circuits, and we refer to it as an impedance-based side-channel. Then, we demonstrate that the impedance difference between transistor gates in the high-state and in the low-state changes the radar cross section (RCS) and modulates the backscattered signal. Furthermore, we have investigated the possibility of implementing the proposed RFID on ASIC for signal enhancement. Finally, we propose a digital circuit that can be used as a semi-passive RFID tag. To illustrate themore »adaptability of the proposed RFID, we have designed a variety of RFID applications across carrier frequencies at 5.8 GHz, 17.46 GHz, and 26.5 GHz to demonstrate flexible carrier frequency selection and bit configuration.« less