Title: Robust Detection of Machine-induced Audio Attacks in Intelligent Audio Systems with Microphone Array
Award ID(s):
2114220
NSF-PAR ID:
10358838
Date Published:
Journal Name:
ACM SIGSAC Conference on Computer and Communications Security
Page Range / eLocation ID:
1884 to 1899
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With the expansion of sensor nodes into newer technology areas, such as the Internet of Things (IoT), Internet of Bodies (IoB), augmented reality (AR), and mixed reality, demand is growing to support high-speed operations, such as audio and video, with a minimal increase in power consumption. In this work, we focus on nodes operating in audio-based AR (AAR) and explore the opportunity of supporting audio at a low power budget. For sensor nodes, communicating one bit of data usually consumes significantly more power than sensing and processing/computing one bit. Reducing the number of communicated bits at the expense of a few computation cycles therefore considerably reduces the overall power consumption of the nodes. Audio codecs such as AAC and LDAC, which currently perform compression and decompression of audio streams, burn significant power and set a floor on the minimum power achievable in these applications. Compressive sensing (CS), a powerful mathematical tool for compression, is often used in physiological signal sensing, such as EEG and ECG, and it can offer a promising low-power alternative to audio codecs. We introduce a new paradigm of using a CS-based approach to realize audio compression that can function as a new independent technique or augment existing codecs for a higher level of compression. This work, CS-Audio, fabricated in TSMC 65-nm CMOS technology, presents the first CS-based compression, equipped with an on-chip DWT sparsifier for non-sparse audio signals. The CS design, realized in a pipelined architecture, achieves high data rates and enables a wake-up implementation to bypass computation for insignificant input samples, reducing the power consumption of the hardware.
Measurement results demonstrate a 3X-15X reduction in transmitted audio data without perceivable degradation of audio quality, as indicated by a perceptual evaluation of audio quality mean opinion score (PEAQ MOS) above 1.5. The hardware consumes 238 μW at 0.65 V and 15 Mbps, roughly 20X-40X lower than audio codecs.
  2. Apollo 11 was the first manned space mission to successfully bring astronauts to the Moon and return them safely. As part of NASA’s goal of assessing team and mission success, all voice communications among mission control, astronauts, and support staff were captured using a multichannel analog system; until recently, these recordings had never been made available. More than 400 personnel served as mission specialists/support and communicated across 30 audio loops, resulting in more than 9,000 hours of data. It is essential to identify each speaker’s role during Apollo and to analyze how the group communicated to achieve a common goal. Because manual annotation is costly, robust speaker identification and tracking methods are needed. In this study, a 100-hour subset of the 9,000-hour Fearless Steps (FSteps) Apollo 11 audio data was investigated, corresponding to three critical mission phases: liftoff, lunar landing, and lunar walk. A speaker recognition assessment is performed on 140 speakers from a collective set of 183 NASA mission specialists who participated, based on sufficient training data obtained from 5 (out of 30) mission channels. We observe that SincNet performs best in terms of accuracy and F score, achieving 78.6% accuracy. Speaker models trained on specific phases are also compared with each other to determine whether stress, g-force/atmospheric pressure, acoustic environments, etc., impact the robustness of the models. Higher performance was obtained using i-vector and x-vector systems for phases with limited data, such as liftoff and lunar walk. When provided with a sufficient amount of data (lunar landing phase), SincNet was shown to perform best. This represents one of the first investigations of speaker recognition for very large team-based communications involving naturalistic communication data.
In addition, we use the concept of “Where’s Waldo?” to identify key speakers of interest (SOIs) and track them over the complete FSteps audio corpus. This additional task provides an opportunity for the research community to transition the FSteps collection as an educational resource while also serving as a tribute to the “heroes behind the heroes of Apollo.” 
  3. The transformation of engineering culture towards inclusion is a key objective in the retention and professionalization of a diverse engineering workforce. Faculty are key stakeholders impacting that inclusion because of their prominent role in shaping students’ underrepresented, marginalized, and/or hidden identities and core experiences in engineering classrooms. Yet, many faculty are not provided with practicable resources and training that can enrich their knowledge, empathy, and understanding of students’ diverse and marginalized experiences that differ from their own. This lack of resources has slowed the transformation of engineering culture and provides an opportunity for practical impact by researchers and faculty developers. However, the topic of developing inclusive culture remains understudied and has evaded traditional approaches to education research. Quantitative approaches can broadly identify the presence of marginalization or inclusion, but they lack the nuance to enhance a reader’s inclusive understanding. In contrast, qualitative and narrative-based approaches provide rich accounts of marginalized experiences and perspectives, but do not typically reach a broad audience of technical engineering faculty. Thus, these accounts are often disseminated to faculty and researchers already interested and invested in broadening participation, perpetuating a cycle of “preaching to the choir”. In the Audio for Inclusion project, we answer BPE’s call for innovative methods that increase research impact on broadening participation outcomes by proposing a novel audio narrative dissemination approach to foster inclusive understandings for engineering faculty. Specifically, we ask the following research questions:
     ● What marginalized student narratives related to identity and agency are present in engineering educational culture?
     ● How does hearing these narratives impact faculty perspectives of diversity and inclusion in engineering classrooms?
This interactive poster presents the student audio narratives developed so far and overviews the entire project. 