Title: MAVL: Multiresolution Analysis of Voice Localization.
The ability of a smart speaker to localize a user based on his/her voice opens the door to many new applications. In this paper, we present a novel system, MAVL, to localize human voice. It consists of three major components: (i) We first develop a novel multi-resolution analysis to estimate the Angle-of-Arrival (AoA) of time-varying low-frequency coherent voice signals coming from multiple propagation paths; (ii) We then automatically estimate the room structure by emitting acoustic signals and developing an improved 3D MUSIC algorithm; (iii) We finally re-trace the paths using the estimated AoA and room structure to localize the voice. We implement a prototype system using a single speaker and a uniform circular microphone array. Our results show that it achieves median errors of 1.49° and 3.33° in estimating the top two AoAs, and median localization errors of 0.31 m in line-of-sight (LoS) cases and 0.47 m in non-line-of-sight (NLoS) cases.
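Step (iii) of the abstract, re-tracing paths from the AoAs and the room structure, can be illustrated with a simplified geometric sketch. The code below is not MAVL's algorithm. It assumes a 2D room, a single wall whose position is already known, one AoA that belongs to the direct (LoS) path, and one AoA that belongs to a single-bounce reflection off that wall. The function names (`locate_source`, `reflect_point`, and so on) are hypothetical. Under these assumptions the reflected path appears to come from the mirror image of the source, so mirroring the second ray across the wall and intersecting it with the direct ray recovers the source position.

```python
import math


def reflect_point(p, wall_point, wall_normal):
    """Mirror point p across the wall line (wall_normal must be unit length)."""
    dist = (p[0] - wall_point[0]) * wall_normal[0] + (p[1] - wall_point[1]) * wall_normal[1]
    return (p[0] - 2.0 * dist * wall_normal[0], p[1] - 2.0 * dist * wall_normal[1])


def reflect_dir(d, wall_normal):
    """Mirror direction d across the wall line."""
    dot = d[0] * wall_normal[0] + d[1] * wall_normal[1]
    return (d[0] - 2.0 * dot * wall_normal[0], d[1] - 2.0 * dot * wall_normal[1])


def intersect_rays(p1, d1, p2, d2, eps=1e-12):
    """Solve p1 + t1*d1 == p2 + t2*d2 and return the intersection point."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        raise ValueError("rays are parallel; the two AoAs do not fix a position")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (rx * d2[1] - ry * d2[0]) / cross
    t2 = (rx * d1[1] - ry * d1[0]) / cross
    if t1 <= 0.0 or t2 <= 0.0:
        raise ValueError("intersection lies behind one of the rays")
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


def locate_source(mic, aoa_los, aoa_reflected, wall_point, wall_normal):
    """Estimate the 2D source position from a LoS AoA and a one-bounce AoA (radians)."""
    d_los = (math.cos(aoa_los), math.sin(aoa_los))
    d_ref = (math.cos(aoa_reflected), math.sin(aoa_reflected))
    # The reflected ray from the array points at the image source. Mirroring the
    # array position and the ray direction across the wall gives a ray that
    # passes through the true source instead.
    mic_image = reflect_point(mic, wall_point, wall_normal)
    d_ref_image = reflect_dir(d_ref, wall_normal)
    return intersect_rays(mic, d_los, mic_image, d_ref_image)


# Array at the origin, wall along y = 2, true source at (3, 1).
# The image source is then at (3, 3), so the reflected AoA is 45 degrees.
x, y = locate_source((0.0, 0.0), math.atan2(1.0, 3.0), math.atan2(3.0, 3.0),
                     (0.0, 2.0), (0.0, 1.0))
print(round(x, 6), round(y, 6))  # 3.0 1.0
```

With noisy AoAs the two rays still intersect, but the intersection point shifts, and the shift grows as the rays become closer to parallel. That sensitivity is one reason a practical system would combine more than two paths, as the abstract suggests.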
Authors:
Award ID(s):
2032125
Publication Date:
NSF-PAR ID:
10227445
Journal Name:
In Proc. of NSDI
Sponsoring Org:
National Science Foundation
More Like this
  1. The ability of a smart speaker to localize a user based on his/her voice opens the door to many new applications. In this paper, we present a novel system, MAVL, to localize human voice. It consists of three major components: (i) We first develop a novel multi-resolution analysis to estimate the AoA of time-varying low-frequency coherent voice signals coming from multiple propagation paths; (ii) We then automatically estimate the room structure by emitting acoustic signals and developing an improved 3D MUSIC algorithm; (iii) We finally re-trace the paths using the estimated AoA and room structure to localize the voice. We implement a prototype system using a single speaker and a uniform circular microphone array. Our results show that it achieves median errors of 1.49 degrees and 3.33 degrees in estimating the top two AoAs, and median localization errors of 0.31 m in line-of-sight (LoS) cases and 0.47 m in non-line-of-sight (NLoS) cases.
  2. In this work, we propose a novel approach for high-accuracy user localization by merging tools from both millimeter wave (mmWave) imaging and communications. The key idea of the proposed solution is to leverage mmWave imaging to construct a high-resolution 3D image of the line-of-sight (LOS) and non-line-of-sight (NLOS) objects in the environment at one antenna array. Then, uplink pilot signaling with the user is used to estimate the angle-of-arrival (AoA) and time-of-arrival (ToA) of the dominant channel paths. By projecting the AoA and ToA information onto the 3D mmWave images of the environment, the proposed solution can locate the user with sub-centimeter accuracy. This approach has several gains. First, it allows accurate simultaneous localization and mapping (SLAM) from a single standpoint, i.e., using only one antenna array. Second, it does not require any prior knowledge of the surrounding environment. Third, it can locate NLOS users, even if their signals experience more than one reflection, and without requiring an antenna array at the user. The approach is evaluated using a hardware setup, and its ability to provide sub-centimeter localization accuracy is shown.
  3. In this paper we derive a new capability for robots to measure relative direction, or Angle-of-Arrival (AOA), to other robots, while operating in non-line-of-sight and unmapped environments, without requiring external infrastructure. We do so by capturing all of the paths that a WiFi signal traverses as it travels from a transmitting to a receiving robot in the team, which we term an AOA profile. The key intuition behind our approach is to emulate antenna arrays in the air as a robot moves freely in 2D or 3D space. The small differences in the phase and amplitude of WiFi signals are thus processed with knowledge of a robot's local displacements (often provided via inertial sensors) to obtain the profile, via a method akin to Synthetic Aperture Radar (SAR). The main contribution of this work is the development of i) a framework to accommodate arbitrary 2D and 3D trajectories, as well as continuous mobility of both transmitting and receiving robots, while computing AOA profiles between them and ii) an accompanying analysis that provides a lower bound on the variance of AOA estimation as a function of robot trajectory geometry, based on the Cramér-Rao bound and antenna array theory. This is a critical distinction from previous work on SAR that restricts robot mobility to prescribed motion patterns, does not generalize to the full 3D space, and/or requires transmitting robots to be static during data acquisition periods. In fact, we find that allowing robots to use their full mobility in 3D space while performing SAR results in more accurate AOA profiles and thus better AOA estimation. We formally characterize this observation as the informativeness of the trajectory, a computable quantity for which we derive a closed form. All theoretical developments are substantiated by extensive simulation and hardware experiments on air/ground robot platforms.
Our experimental results bolster our theoretical findings, demonstrating that 3D trajectories provide enhanced and consistent accuracy, with AOA error of less than 10 deg for 95% of trials. We also show that our formulation can be used with an off-the-shelf trajectory estimation sensor (Intel RealSense T265 tracking camera) for estimating the robots' local displacements, and we provide theoretical as well as empirical results that show the impact of typical trajectory estimation errors on the measured AOA. Finally, we demonstrate the performance of our system on a multi-robot task where a heterogeneous air/ground pair of robots continuously measures AOA profiles over a WiFi link to achieve dynamic rendezvous in an unmapped, 300 square meter environment with occlusions.
  4. Voice assistants such as Amazon Echo (Alexa) and Google Home use microphone arrays to estimate the angle of arrival (AoA) of the human voice. This paper focuses on adding user localization as a new capability to voice assistants. For any voice command, we desire Alexa to be able to localize the user inside the home. The core challenge is two-fold: (1) accurately estimating the AoAs of multipath echoes without knowledge of the source signal, and (2) tracing back these AoAs to reverse triangulate the user's location. We develop VoLoc, a system that proposes an iterative align-and-cancel algorithm for improved multipath AoA estimation, followed by an error-minimization technique to estimate the geometry of a nearby wall reflection. The AoAs and geometric parameters of the nearby wall are then fused to reveal the user's location. Under modest assumptions, we report localization accuracy of 0.44 m across different rooms, clutter, and user/microphone locations. VoLoc runs in near real-time but needs to hear around 15 voice commands before becoming operational.
  5. Localization is one of the most interesting topics related to the promising millimeter wave (mmWave) technology. In this paper, we investigate joint channel estimation and localization for a cooperative mmWave system with several receivers. Due to the strong line-of-sight path common to mmWave channels, one can localize the position of the user by exploiting the signal's angle-of-arrival (AoA). Leveraging a variational Bayesian approach, we obtain soft information about the AoA for each receiver. We then use the soft AoA information and geometrical constraints to localize the position of the user and further improve the channel estimation performance. Numerical results show that the proposed algorithm has centimeter-level localization accuracy for an outdoor scene. In addition, the proposed algorithm provides 1-3 dB of gain for channel estimation by exploiting the correlation among the receiver channels, depending on the availability of prior information about the path loss model.
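
All of the abstracts above start from an AoA estimate at an array. As a rough illustration of that first step, the sketch below scans a conventional delay-and-sum beamformer over azimuth for a uniform circular array. This is a deliberately simpler stand-in, not the multi-resolution analysis, MUSIC, align-and-cancel, or variational Bayesian methods named in the abstracts. It assumes a single noise-free, narrowband, far-field source in 2D. The function names, the array size, the radius, and the frequency are all hypothetical choices for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def uca_steering(theta, n_mics, radius, freq):
    """Far-field steering vector of a uniform circular array for azimuth theta (radians)."""
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND  # wavenumber
    return [
        cmath.exp(1j * k * radius * math.cos(theta - 2.0 * math.pi * m / n_mics))
        for m in range(n_mics)
    ]


def delay_and_sum_aoa(snapshot, radius, freq, step_deg=1):
    """Return the azimuth (whole degrees) that maximizes delay-and-sum output power."""
    n_mics = len(snapshot)
    best_deg, best_power = 0, -1.0
    for deg in range(0, 360, step_deg):
        a = uca_steering(math.radians(deg), n_mics, radius, freq)
        # Beamformer output power |a^H x|^2 for this look direction.
        power = abs(sum(am.conjugate() * xm for am, xm in zip(a, snapshot))) ** 2
        if power > best_power:
            best_deg, best_power = deg, power
    return best_deg


# Simulate one noise-free snapshot from a source at 60 degrees azimuth,
# received by a hypothetical 6-microphone array of radius 5 cm at 1 kHz.
snapshot = uca_steering(math.radians(60), 6, 0.05, 1000.0)
print(delay_and_sum_aoa(snapshot, 0.05, 1000.0))  # 60
```

At low frequencies and small array radii the delay-and-sum beam is very wide, so two coherent paths a few tens of degrees apart merge into one peak. That limitation is part of what motivates the higher-resolution estimators described in the abstracts.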