Title: Learning Speaker-Listener Mutual Head Orientation by Leveraging HRTF and Voice Directivity on Headphones
Estimating a speaker’s direction and head orientation from binaural recordings can provide critical information for many real-world applications of emerging ‘earable’ devices, including smart headphones and AR/VR headsets. However, it requires predicting the mutual head orientations of both the speaker and the listener, which is challenging in practice. This paper presents a system for jointly predicting speaker-listener head orientations by leveraging inherent human voice directivity and the listener’s head-related transfer function (HRTF) as perceived by the ear-mounted microphones on the listener. We propose a convolutional neural network model that, given a binaural speech recording, can predict the orientations of both the speaker and the listener with respect to the line joining the two. The system builds on the core observation that the recordings from the left and right ears are differentially affected by the voice directivity as well as the HRTF. We also incorporate the fact that voice is more directional at higher frequencies than at lower frequencies. Our proposed system achieves a 90th-percentile error of 2.5 degrees for the listener’s head orientation and 12.5 degrees for that of the speaker.
Award ID(s):
2238433
PAR ID:
10514754
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN:
979-8-3503-4485-1
Page Range / eLocation ID:
1171 to 1175
Subject(s) / Keyword(s):
Voice directivity; HRTF; head orientation; voiced sounds; auditory perception
Format(s):
Medium: X
Location:
Seoul, Korea, Republic of
Sponsoring Org:
National Science Foundation
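
As a rough sketch of the kind of model described in the abstract above (not the authors' architecture; the input representation, layer sizes, and output parameterization are assumptions), a small two-channel CNN can map stacked left- and right-ear spectrograms to the two orientation angles:

```python
# Minimal sketch, not the paper's implementation: a small CNN that maps
# stacked left-/right-ear spectrograms to two angles -- the listener's and
# the speaker's head orientation relative to the line joining them.
import torch
import torch.nn as nn

class MutualOrientationNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Two input channels (left ear, right ear) let the network exploit the
        # interaural level and spectral differences induced by voice directivity
        # and the listener's HRTF.
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Separate regression heads for the listener and speaker orientations.
        self.listener_head = nn.Linear(64, 1)
        self.speaker_head = nn.Linear(64, 1)

    def forward(self, binaural_spec):            # (batch, 2, n_mels, n_frames)
        h = self.features(binaural_spec).flatten(1)
        return self.listener_head(h), self.speaker_head(h)

# Usage with dummy input: a batch of 8 two-channel (left/right) spectrograms.
model = MutualOrientationNet()
listener_deg, speaker_deg = model(torch.randn(8, 2, 64, 128))
```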
More Like this
  1. We propose a system to improve the intelligibility of group conversations in noisy environments, such as restaurants, by aggregating signals from the mobile and wearable devices of the participants. The proposed system uses a mobile device placed near each talker to capture a low-noise speech signal. Instead of muting inactive microphones, which can be distracting, adaptive crosstalk cancellation filters remove the speech of other users, including delayed auditory feedback of the listener’s own speech. Next, adaptive spatialization filters process the low-noise signals to generate binaural outputs that match the spatial and spectral cues at the ears of each listener. The proposed system is demonstrated using recordings of three human subjects conversing with realistic movement. 
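    A minimal sketch of the crosstalk-cancellation idea described above (illustrative only, not this paper's filters; the tap count and step size are assumed values), using a normalized-LMS adaptive filter to remove one talker's speech from another talker's microphone:

    ```python
    # Illustrative NLMS crosstalk canceller: estimate how talker B's speech leaks
    # into talker A's microphone and subtract it, keeping A's low-noise signal.
    import numpy as np

    def nlms_crosstalk_cancel(mic_a, ref_b, n_taps=128, mu=0.1, eps=1e-8):
        w = np.zeros(n_taps)                  # adaptive estimate of the crosstalk path
        out = np.zeros_like(mic_a)
        for n in range(n_taps, len(mic_a)):
            x = ref_b[n - n_taps:n][::-1]     # recent samples of the reference talker
            y = w @ x                         # predicted crosstalk at mic A
            e = mic_a[n] - y                  # mic A with B's speech removed
            w += mu * e * x / (x @ x + eps)   # normalized-LMS weight update
            out[n] = e
        return out

    # Usage: cancel a synthetic delayed, attenuated copy of talker B from mic A.
    rng = np.random.default_rng(0)
    talker_a, talker_b = rng.standard_normal(16000), rng.standard_normal(16000)
    mic_a = talker_a + 0.5 * np.roll(talker_b, 40)   # crosstalk: delayed, quieter B
    cleaned = nlms_crosstalk_cancel(mic_a, talker_b)
    ```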
  2. Leonardis, Aleš; Ricci, Elisa; Roth, Stefan; Russakovsky, Olga; Sattler, Torsten; Varol, Gül (Eds.)
    Human-human communication is like a delicate dance where listeners and speakers concurrently interact to maintain conversational dynamics. Hence, an effective model for generating listener nonverbal behaviors requires understanding the dyadic context and interaction. In this paper, we present an effective framework for creating 3D facial motions in dyadic interactions. Existing work considers the listener as a reactive agent with reflexive behaviors to the speaker’s voice and facial motions. The heart of our framework is Dyadic Interaction Modeling (DIM), a pre-training approach that jointly models speakers’ and listeners’ motions through masking and contrastive learning to learn representations that capture the dyadic context. To enable the generation of non-deterministic behaviors, we encode both listener and speaker motions into discrete latent representations through a VQ-VAE. The pre-trained model is further fine-tuned for motion generation. Extensive experiments demonstrate the superiority of our framework in generating listener motions, establishing a new state of the art according to quantitative measures capturing the diversity and realism of generated motions. Qualitative results demonstrate the superior capabilities of the proposed approach in generating diverse and realistic expressions, eye blinks, and head gestures.
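    A minimal sketch of the contrastive half of such a pre-training objective (illustrative, not the DIM code; the encoders producing the embeddings and the temperature value are assumptions): an InfoNCE-style loss that aligns speaker and listener motion embeddings drawn from the same dyad:

    ```python
    # Illustrative InfoNCE-style loss: embeddings of speaker and listener motion
    # clips from the same dyad (matching rows) are pulled together, while
    # mismatched pairs across the batch are pushed apart.
    import torch
    import torch.nn.functional as F

    def dyadic_contrastive_loss(speaker_emb, listener_emb, temperature=0.07):
        # speaker_emb, listener_emb: (batch, dim); row i of each comes from one dyad.
        s = F.normalize(speaker_emb, dim=-1)
        l = F.normalize(listener_emb, dim=-1)
        logits = s @ l.t() / temperature              # (batch, batch) similarities
        targets = torch.arange(s.size(0))             # matching pairs on the diagonal
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

    # Usage with dummy embeddings from hypothetical speaker/listener motion encoders.
    loss = dyadic_contrastive_loss(torch.randn(16, 256), torch.randn(16, 256))
    ```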
  3. Smart IoT Speakers, while connected over a network, currently only produce sounds that come directly from the individual devices. We envision a future where smart speakers collaboratively produce a fabric of spatial audio, capable of perceptually placing sound in a range of locations in physical space. This could provide audio cues in homes, offices and public spaces that are flexibly linked to various positions. The perception of spatialized audio relies on binaural cues, especially the time difference and the level difference of incident sound at a user’s left and right ears. Traditional stereo speakers cannot create the spatialization perception for a user when playing binaural audio due to auditory crosstalk, as each ear hears a combination of both speaker outputs. We present Xblock, a novel time-domain pose-adaptive crosstalk cancellation technique that creates a spatial audio perception over a pair of speakers using knowledge of the user’s head pose and speaker positions. We build a prototype smart speaker IoT system empowered by Xblock, explore the effectiveness of Xblock through signal analysis, and discuss future perceptual user studies and future work. 
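    A minimal sketch of the binaural-cue geometry involved (illustrative, not the Xblock implementation; the head radius, coordinate frame, and function names are assumptions), computing the interaural time difference each loudspeaker produces for a given head pose:

    ```python
    # Illustrative geometry: per-loudspeaker interaural time differences (ITDs)
    # derived from the user's head pose -- the kind of pose-dependent delay a
    # crosstalk canceller must account for.
    import numpy as np

    SPEED_OF_SOUND = 343.0   # m/s
    HEAD_RADIUS = 0.0875     # m, roughly half a typical interaural distance (assumed)

    def ear_positions(head_pos, head_yaw_rad):
        # Ears sit left/right of the head center, perpendicular to the facing
        # direction (x forward, y left, z up; yaw measured about the z axis).
        right = np.array([np.cos(head_yaw_rad - np.pi / 2),
                          np.sin(head_yaw_rad - np.pi / 2), 0.0])
        return head_pos - HEAD_RADIUS * right, head_pos + HEAD_RADIUS * right

    def itd_per_speaker(speaker_positions, head_pos, head_yaw_rad):
        left_ear, right_ear = ear_positions(np.asarray(head_pos, float), head_yaw_rad)
        itds = []
        for spk in np.asarray(speaker_positions, float):
            t_left = np.linalg.norm(spk - left_ear) / SPEED_OF_SOUND
            t_right = np.linalg.norm(spk - right_ear) / SPEED_OF_SOUND
            itds.append(t_left - t_right)   # positive: sound reaches the right ear first
        return itds

    # Usage: two loudspeakers in front of a user standing at the origin, facing +x.
    print(itd_per_speaker([[2.0, -0.5, 0.0], [2.0, 0.5, 0.0]],
                          head_pos=[0.0, 0.0, 0.0], head_yaw_rad=0.0))
    ```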
  4. Skarnitzl, R. & (Ed.)
    The timing of both manual co-speech gestures and head gestures is sensitive to the prosodic structure of speech. However, head gestures are used not only by speakers, but also by listeners as a backchanneling device. Little research exists on the timing of gestures in backchanneling. To address this gap, we compare the timing of listener and speaker head gestures in an interview context. Results reveal the dual role that head gestures play in speech and conversational interaction: while they are coordinated in key ways with one’s own speech, they are also coordinated with the gestures (and hence, the speech) of a conversation partner when one is actively listening to them. We also show that head gesture timing is sensitive to social dynamics between interlocutors. This study provides a novel contribution to the literature on head gesture timing and has implications for studies of discourse and accommodation.
  5.
    Although state-of-the-art smart speakers can hear a user's speech, unlike a human assistant these devices cannot figure out users' verbal references based on their head location and orientation. Soundr presents a novel interaction technique that leverages the built-in microphone array found in most smart speakers to infer the user's spatial location and head orientation using only their voice. With that extra information, Soundr can figure out users' references to objects, people, and locations based on the speaker's gaze, and can also provide relative directions. To provide training data for our neural network, we collected 751 minutes of data (50x that of the best prior work) from human speakers, leveraging a virtual reality headset to provide accurate head-tracking ground truth. Our results achieve an average positional error of 0.31 m and an orientation angle accuracy of 34.3° for each voice command. A user study evaluating user preferences for controlling IoT appliances by talking at them found this new approach to be fast and easy to use.
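    A minimal sketch of one spatial feature such a microphone-array system could compute (GCC-PHAT time-delay estimation between a microphone pair; this is illustrative and not Soundr's model, whose features and network are not described here):

    ```python
    # Illustrative GCC-PHAT: estimate the time delay of the same speech signal
    # between two microphones of an array -- a spatial cue a model could use to
    # infer the talker's position and head orientation.
    import numpy as np

    def gcc_phat(sig, ref, fs=16000, max_tau=0.002):
        n = len(sig) + len(ref)
        S = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
        cc = np.fft.irfft(S / (np.abs(S) + 1e-12), n=n)    # phase-transform weighting
        max_shift = int(fs * max_tau)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        shift = np.argmax(np.abs(cc)) - max_shift
        return shift / fs                                   # estimated delay in seconds

    # Usage: recover a 20-sample delay between two noisy copies of the same signal.
    rng = np.random.default_rng(1)
    x = rng.standard_normal(16000)
    delayed = np.roll(x, 20) + 0.05 * rng.standard_normal(16000)
    print(gcc_phat(delayed, x))   # approximately 20 / 16000 s
    ```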