Microphone identification addresses the challenge of identifying a microphone's signature from the recorded signal. An audio recording system (microphone, A/D converter, codec, etc.) leaves unique traces in the signals it records. The microphone system can be modeled as a linear time-invariant system whose impulse response is convolved with any audio signal recorded through it. This paper attempts to identify the microphone from its frequency response. To estimate the frequency response, we employ the sine sweep method, which is independent of speech characteristics: sinusoidal signals of increasing frequency are generated, and the recording of each is captured. A detailed evaluation of the sine sweep method shows that the frequency response of each microphone is stable. A neural-network-based classifier is then trained to identify the microphone from the recorded signal. Results show that the proposed method achieves 100% identification accuracy.
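The stepped sine sweep described above can be sketched as follows. This is a minimal illustration under assumptions of our own (a toy one-pole low-pass "microphone", RMS-ratio gain estimation), not the paper's implementation; the `record` callable is a hypothetical stand-in for real playback-and-capture hardware.

```python
import numpy as np

def stepped_sine_response(record, freqs, fs=16000, dur=0.5):
    """Estimate a microphone's magnitude response with a stepped sine sweep.

    `record` takes a test tone and returns the microphone's recording of it.
    For an LTI channel, the RMS ratio approximates |H(f)| at each tone.
    """
    response = []
    for f in freqs:
        t = np.arange(int(dur * fs)) / fs
        tone = np.sin(2 * np.pi * f * t)
        captured = record(tone)
        gain = np.sqrt(np.mean(captured**2)) / np.sqrt(np.mean(tone**2))
        response.append(gain)
    return np.array(response)

# Toy "microphone": a one-pole low-pass filter standing in for the channel.
def toy_mic(x, alpha=0.6):
    y = np.empty_like(x)
    acc = 0.0
    for i, s in enumerate(x):
        acc = alpha * acc + (1 - alpha) * s
        y[i] = acc
    return y

freqs = [250, 500, 1000, 2000, 4000]
H = stepped_sine_response(toy_mic, freqs)
```

For the low-pass toy channel, the estimated gains fall off with frequency; a fingerprinting system would feed a response vector like `H` to the classifier.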
HPSpeech: Silent Speech Interface for Commodity Headphones
We present HPSpeech, a silent speech interface for commodity headphones. HPSpeech utilizes the existing speakers of the headphones to emit inaudible acoustic signals. The movements of the temporomandibular joint (TMJ) during speech modify the reflection pattern of these signals, which are captured by a microphone positioned inside the headphones. To evaluate the performance of HPSpeech, we tested it on two headphones with a total of 18 participants. The results demonstrated that HPSpeech successfully recognized 8 popular silent speech commands for controlling the music player with an accuracy of over 90%. While our tests use modified commodity hardware (both with and without active noise cancellation), our results show that sensing the movement of the TMJ could be as simple as a firmware update for ANC headsets, which already include a microphone inside the ear cup. This leads us to believe that this technique has great potential for rapid deployment in the near future. We further discuss the challenges that need to be addressed before deploying HPSpeech at scale.
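The sensing principle (emit a known inaudible probe, capture its reflections, track how the echo pattern changes) can be sketched with a matched filter. This is an illustrative toy under our own assumptions (a chirp probe, a simulated two-path capture), not HPSpeech's actual pipeline.

```python
import numpy as np

def echo_profile(probe, captured):
    """Cross-correlate the capture with the emitted probe.

    Peaks in the profile mark reflection paths; their drift between
    frames encodes motion of nearby surfaces such as the jaw.
    """
    corr = np.correlate(captured, probe, mode="valid")
    return np.abs(corr) / np.sum(probe**2)

fs = 48000
t = np.arange(int(0.01 * fs)) / fs            # 10 ms probe
# Near-ultrasonic linear chirp, 18-22 kHz (sharp autocorrelation peak).
probe = np.sin(2 * np.pi * (18000 * t + (4000 / (2 * 0.01)) * t**2))

# Simulated capture: direct path plus one delayed, weaker reflection.
delay = 37                                     # extra path delay, in samples
captured = np.zeros(len(probe) + 100)
captured[:len(probe)] += probe
captured[delay:delay + len(probe)] += 0.4 * probe

prof = echo_profile(probe, captured)
```

The direct path shows up at lag 0 and the reflection near `delay`; a gesture recognizer would classify sequences of such profiles.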
- Award ID(s): 2239569
- PAR ID: 10492815
- Publisher / Repository: ACM
- Date Published:
- Journal Name: The ACM International Symposium on Wearable Computing (ISWC'23)
- ISBN: 9798400701993
- Page Range / eLocation ID: 60 to 65
- Format(s): Medium: X
- Location: Cancun, Quintana Roo, Mexico
- Sponsoring Org: National Science Foundation
More Like this
-
We consider the problem of separating speech from several talkers in background noise using a fixed microphone array and a set of wearable devices. Wearable devices can provide reliable information about speech from their wearers, but they typically cannot be used directly for multichannel source separation due to network delay, sample rate offsets, and relative motion. Instead, the wearable microphone signals are used to compute the speech presence probability for each talker at each time-frequency index. Those parameters, which are robust against small sample rate offsets and relative motion, are used to track the second-order statistics of the speech sources and background noise. The fixed array then separates the speech signals using an adaptive linear time-varying multichannel Wiener filter. The proposed method is demonstrated using real-room recordings from three human talkers with binaural earbud microphones and an eight-microphone tabletop array.
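The final filtering step above can be sketched for a single time-frequency index. This is a simplified illustration under assumptions not stated in the abstract (known covariances, a rank-1 speech model, white noise); the function name and variables are ours.

```python
import numpy as np

def mwf_weights(R_s, R_n, ref=0):
    """Multichannel Wiener filter from tracked covariance estimates.

    R_s, R_n: (M, M) spatial covariances of target speech and noise at
    one time-frequency index; `ref` selects the reference microphone.
    The MMSE estimate of the speech at the reference mic is w^H x with
    w = (R_s + R_n)^{-1} R_s e_ref.
    """
    R_x = R_s + R_n                        # mixture covariance
    W = np.linalg.solve(R_x, R_s)          # (R_s + R_n)^{-1} R_s
    return W[:, ref]

rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)  # steering vector
R_s = np.outer(a, a.conj())               # rank-1 speech covariance
R_n = 0.1 * np.eye(M)                     # white noise covariance
w = mwf_weights(R_s, R_n)

x = a * 1.0                               # noise-free speech snapshot
est = w.conj() @ x                        # ~ speech at the reference mic
```

With low noise the filter output closely recovers the speech component at the reference microphone; in the paper's setting the covariances are tracked over time, so `w` is recomputed as the statistics evolve.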
-
The microphone systems employed by smart devices such as cellphones and tablets require case penetrations that leave them vulnerable to environmental damage. A structural sensor mounted on the back of the display screen can be employed to record audio by capturing the bending vibration signals induced in the display panel by an incident acoustic wave, enabling a functional microphone on a fully sealed device. Distributed piezoelectric sensing elements and low-noise accelerometers were bonded to the surfaces of several different panels and used to record acoustic speech signals. The quality of the recorded signals was assessed using the speech transmission index, and the recordings were transcribed to text using an automatic speech recognition system. Although the quality of the speech signals recorded by the piezoelectric sensors was reduced compared to the quality of speech recorded by the accelerometers, the word-error-rate of each transcription increased only by approximately 2% on average, suggesting that distributed piezoelectric sensors can be used as a low-cost surface microphone for smart devices that employ automatic speech recognition. A method of crosstalk cancellation was also implemented to enable the simultaneous recording and playback of audio signals by an array of piezoelectric elements and evaluated by the measured improvement in the recording's signal-to-interference ratio.
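The abstract does not specify the cancellation method; a common baseline for removing a known playback signal from a recording is an LMS adaptive filter, sketched here under our own assumptions (a short FIR leakage path, white playback signal, a sinusoidal "speech" component).

```python
import numpy as np

def lms_crosstalk_cancel(playback, recorded, taps=8, mu=0.01):
    """Subtract an adaptively estimated playback leakage from `recorded`.

    Models the playback-to-sensor path as an FIR filter and adapts its
    coefficients with the LMS rule; the residual is the cleaned signal.
    """
    w = np.zeros(taps)
    out = np.zeros_like(recorded)
    for n in range(taps, len(recorded)):
        x = playback[n - taps + 1:n + 1][::-1]   # recent playback samples
        e = recorded[n] - w @ x                   # residual after cancellation
        out[n] = e
        w += mu * e * x                           # LMS update
    return out

rng = np.random.default_rng(1)
fs = 8000
n = fs * 2
playback = rng.standard_normal(n)                 # signal being played back
speech = 0.3 * np.sin(2 * np.pi * 440 * np.arange(n) / fs)
h = np.array([0.5, 0.2, -0.1])                    # assumed leakage path
leak = np.convolve(playback, h)[:n]
recorded = speech + leak

cleaned = lms_crosstalk_cancel(playback, recorded)
```

After convergence the residual leakage power drops well below its original level, which is the kind of signal-to-interference improvement the paper measures.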
-
Sensing movements and gestures inside the oral cavity has been a long-standing challenge for the wearable research community. This paper introduces EchoNose, a novel nose interface that explores a unique sensing approach to recognize gestures related to mouth, breathing, and tongue by analyzing the acoustic signal reflections inside the nasal and oral cavities. The interface incorporates a speaker and a microphone placed at the nostrils, emitting inaudible acoustic signals and capturing the corresponding reflections. These received signals were processed using a customized data processing and machine learning pipeline, enabling the distinction of 16 gestures involving speech, tongue, and breathing. A user study with 10 participants demonstrates that EchoNose achieves an average accuracy of 93.7% in recognizing these 16 gestures. Based on these promising results, we discuss the potential opportunities and challenges associated with applying this innovative nose interface in various future applications.
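The abstract leaves the machine learning pipeline unspecified. As a toy stand-in for its classification stage, a nearest-centroid classifier over per-gesture feature vectors (everything here, including the synthetic "echo profile" features, is our own illustration):

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Per-class mean feature vectors: a minimal gesture classifier."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def nearest_centroid_predict(X, classes, centroids):
    # Assign each sample to the class with the closest centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]

rng = np.random.default_rng(2)
# Synthetic features: each gesture shifts the echo-profile mean.
n_gestures, n_feat, n_per = 4, 32, 30
means = rng.standard_normal((n_gestures, n_feat))
X = np.concatenate([m + 0.2 * rng.standard_normal((n_per, n_feat))
                    for m in means])
y = np.repeat(np.arange(n_gestures), n_per)

classes, cents = nearest_centroid_fit(X, y)
acc = np.mean(nearest_centroid_predict(X, classes, cents) == y)
```

A real system would replace the synthetic features with processed reflection signals and likely a stronger model, but the fit/predict structure is the same.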
-
Abstract Silent speech interfaces offer an alternative and efficient communication modality for individuals with voice disorders and when vocalized speech communication is compromised by noisy environments. Despite the recent progress in developing silent speech interfaces, these systems face several challenges that prevent their wide acceptance, such as bulkiness, obtrusiveness, and immobility. Herein, the material optimization, structural design, deep learning algorithm, and system integration of mechanically and visually unobtrusive silent speech interfaces are presented that can realize both speaker identification and speech content identification. Conformal, transparent, and self-adhesive electromyography electrode arrays are designed for capturing speech-relevant muscle activities. Temporal convolutional networks are employed for recognizing speakers and converting sensing signals into spoken content. The resulting silent speech interfaces achieve a 97.5% speaker classification accuracy and 91.5% keyword classification accuracy using four electrodes. The speech interface is further integrated with an optical hand-tracking system and a robotic manipulator for human-robot collaboration in both assembly and disassembly processes. The integrated system achieves control of the robot manipulator by silent speech and facilitates the hand-over process by hand motion trajectory detection. The developed framework enables natural robot control in noisy environments and lays the groundwork for collaborative human-robot tasks involving multiple human operators.
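The core operation of a temporal convolutional network is the dilated causal convolution, which lets the receptive field grow with depth while output at time t depends only on past inputs. A minimal numpy sketch of that single operation (not the authors' network):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """One dilated causal convolution, the building block of a TCN.

    Output at time t sees only x[t], x[t-d], x[t-2d], ... so the layer
    never looks into the future; left zero-padding keeps the length.
    """
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        taps = xp[t + pad - np.arange(k) * dilation]  # most recent first
        y[t] = w @ taps
    return y

x = np.arange(8, dtype=float)     # toy 1-D signal (e.g. one EMG channel)
w = np.array([1.0, 1.0])          # kernel of size 2
y1 = causal_dilated_conv(x, w, dilation=1)   # y1[t] = x[t] + x[t-1]
y2 = causal_dilated_conv(x, w, dilation=2)   # y2[t] = x[t] + x[t-2]
```

Stacking such layers with dilations 1, 2, 4, ... gives exponentially growing temporal context, which is why TCNs suit sequence labeling tasks like mapping EMG to keywords.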