Title: MAVL: Multiresolution Analysis of Voice Localization.
The ability of a smart speaker to localize a user based on his/her voice opens the door to many new applications. In this paper, we present a novel system, MAVL, to localize human voice. It consists of three major components: (i) We first develop a novel multi-resolution analysis to estimate the Angle-of-Arrival (AoA) of time-varying low-frequency coherent voice signals coming from multiple propagation paths; (ii) We then automatically estimate the room structure by emitting acoustic signals and developing an improved 3D MUSIC algorithm; (iii) We finally re-trace the paths using the estimated AoA and room structure to localize the voice. We implement a prototype system using a single speaker and a uniform circular microphone array. Our results show that it achieves median errors of 1.49° and 3.33° in estimating the top two AoAs, and median localization errors of 0.31 m in line-of-sight (LoS) cases and 0.47 m in non-line-of-sight (NLoS) cases.
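The path re-tracing in step (iii) can be illustrated with a much simplified 2D sketch. This is not MAVL's implementation: it assumes a single flat wall whose position is already known, exactly two AoAs (one direct path and one single-bounce reflection), and it uses the standard mirror-image construction to turn the reflected path into a straight ray. The function name `retrace_2d` and all parameters are hypothetical and chosen only for illustration.

```python
import math


def _mirror_point(p, wall_point, n):
    """Mirror point p across the wall line through wall_point with unit normal n."""
    dist = (p[0] - wall_point[0]) * n[0] + (p[1] - wall_point[1]) * n[1]
    return (p[0] - 2 * dist * n[0], p[1] - 2 * dist * n[1])


def _mirror_dir(d, n):
    """Mirror direction vector d about a wall with unit normal n."""
    dot = d[0] * n[0] + d[1] * n[1]
    return (d[0] - 2 * dot * n[0], d[1] - 2 * dot * n[1])


def retrace_2d(array_pos, aoa_direct, aoa_reflected, wall_point, wall_normal):
    """Estimate a 2D source position from two AoAs and one known wall.

    Assumptions (illustrative only): angles are in radians, measured at the
    array from the +x axis; aoa_direct points at the source, aoa_reflected
    points at the bounce point on the wall.
    """
    norm = math.hypot(wall_normal[0], wall_normal[1])
    n = (wall_normal[0] / norm, wall_normal[1] / norm)

    d1 = (math.cos(aoa_direct), math.sin(aoa_direct))
    d2 = (math.cos(aoa_reflected), math.sin(aoa_reflected))

    # Unfold the reflection: the source lies on the ray that starts at the
    # mirrored array position and runs along the mirrored arrival direction.
    p2 = _mirror_point(array_pos, wall_point, n)
    d2m = _mirror_dir(d2, n)

    # Solve array_pos + s * d1 == p2 + t * d2m for s with Cramer's rule.
    det = d2m[0] * d1[1] - d1[0] * d2m[1]
    if abs(det) < 1e-12:
        raise ValueError("direct and unfolded reflected rays are parallel")
    rx, ry = p2[0] - array_pos[0], p2[1] - array_pos[1]
    s = (d2m[0] * ry - rx * d2m[1]) / det
    return (array_pos[0] + s * d1[0], array_pos[1] + s * d1[1])


if __name__ == "__main__":
    # Array at the origin, wall along x = 3, true source at (2, 1).
    # The reflection appears to come from the image source at (4, 1).
    est = retrace_2d((0.0, 0.0), math.atan2(1, 2), math.atan2(1, 4),
                     (3.0, 0.0), (1.0, 0.0))
    print(est)  # close to (2.0, 1.0)
```

With noisy AoAs the two rays rarely meet exactly at the true position, so a practical system would more likely minimize an error over several paths than intersect two rays; the sketch only shows the geometric idea.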
Award ID(s):
2032125
PAR ID:
10227445
Author(s) / Creator(s):
Date Published:
Journal Name:
In Proc. of NSDI
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The ability of a smart speaker to localize a user based on his/her voice opens the door to many new applications. In this paper, we present a novel system, MAVL, to localize human voice. It consists of three major components: (i) We first develop a novel multi-resolution analysis to estimate the AoA of time-varying low-frequency coherent voice signals coming from multiple propagation paths; (ii) We then automatically estimate the room structure by emitting acoustic signals and developing an improved 3D MUSIC algorithm; (iii) We finally re-trace the paths using the estimated AoA and room structure to localize the voice. We implement a prototype system using a single speaker and a uniform circular microphone array. Our results show that it achieves median errors of 1.49° and 3.33° in estimating the top two AoAs, and median localization errors of 0.31 m in line-of-sight (LoS) cases and 0.47 m in non-line-of-sight (NLoS) cases.
  2. In this work, we propose a novel approach for high-accuracy user localization by merging tools from both millimeter wave (mmWave) imaging and communications. The key idea of the proposed solution is to leverage mmWave imaging to construct a high-resolution 3D image of the line-of-sight (LOS) and non-line-of-sight (NLOS) objects in the environment at one antenna array. Then, uplink pilot signaling with the user is used to estimate the angle-of-arrival (AoA) and time-of-arrival (ToA) of the dominant channel paths. By projecting the AoA and ToA information onto the 3D mmWave images of the environment, the proposed solution can locate the user with sub-centimeter accuracy. This approach has several advantages. First, it allows accurate simultaneous localization and mapping (SLAM) from a single standpoint, i.e., using only one antenna array. Second, it does not require any prior knowledge of the surrounding environment. Third, it can locate NLOS users, even if their signals experience more than one reflection, and without requiring an antenna array at the user. The approach is evaluated using a hardware setup, and its ability to provide sub-centimeter localization accuracy is demonstrated.
  3. Voice assistants such as Amazon Echo (Alexa) and Google Home use microphone arrays to estimate the angle of arrival (AoA) of the human voice. This paper focuses on adding user localization as a new capability to voice assistants. For any voice command, we desire Alexa to be able to localize the user inside the home. The core challenge is two-fold: (1) accurately estimating the AoAs of multipath echoes without knowledge of the source signal, and (2) tracing back these AoAs to reverse triangulate the user's location. We develop VoLoc, a system that proposes an iterative align-and-cancel algorithm for improved multipath AoA estimation, followed by an error-minimization technique to estimate the geometry of a nearby wall reflection. The AoAs and geometric parameters of the nearby wall are then fused to reveal the user's location. Under modest assumptions, we report localization accuracy of 0.44 m across different rooms, clutter, and user/microphone locations. VoLoc runs in near real-time but needs to hear around 15 voice commands before becoming operational.
  4. In this paper, we develop the analytical framework for a novel Wireless signal-based Sensing capability for Robotics (WSR) by leveraging robots' mobility in 3D space. It allows robots to primarily measure relative direction, or Angle-of-Arrival (AOA), to other robots, while operating in non-line-of-sight unmapped environments and without requiring external infrastructure. We do so by capturing all of the paths that a wireless signal traverses as it travels from a transmitting to a receiving robot in the team, which we term an AOA profile. The key intuition behind our approach is to enable a robot to emulate antenna arrays as it moves freely in 2D and 3D space. The small differences in the phase of the wireless signals are thus processed with knowledge of robots' local displacement to obtain the profile, via a method akin to Synthetic Aperture Radar (SAR). The main contribution of this work is the development of (i) a framework to accommodate arbitrary 2D and 3D motion, as well as continuous mobility of both signal transmitting and receiving robots, while computing AOA profiles between them and (ii) a Cramer–Rao Bound analysis, based on antenna array theory, that provides a lower bound on the variance in AOA estimation as a function of the geometry of robot motion. This is a critical distinction from previous work on SAR-based methods that restrict robot mobility to prescribed motion patterns, do not generalize to the full 3D space, and require transmitting robots to be stationary during data acquisition periods. We show that allowing robots to use their full mobility in 3D space while performing SAR results in more accurate AOA profiles and thus better AOA estimation. We formally characterize this observation as the informativeness of the robots' motion, a computable quantity for which we derive a closed form. All analytical developments are substantiated by extensive simulation and hardware experiments on air/ground robot platforms using 5 GHz WiFi. Our experimental results bolster our analytical findings, demonstrating that 3D motion provides enhanced and consistent accuracy, with a total AOA error of less than 10° for 95% of trials. We also analytically characterize the impact of displacement estimation errors on the measured AOA and validate this theory empirically using robot displacements obtained using an off-the-shelf Intel Tracking Camera T265. Finally, we demonstrate the performance of our system on a multi-robot task where a heterogeneous air/ground pair of robots continuously measure AOA profiles over a WiFi link to achieve dynamic rendezvous in an unmapped, 300 m² environment with occlusions.
  5. Localization is one of the most interesting topics related to the promising millimeter wave (mmWave) technology. In this paper, we investigate joint channel estimation and localization for a cooperative mmWave system with several receivers. Due to the strong line-of-sight path common to mmWave channels, one can localize the position of the user by exploiting the signal's angle-of-arrival (AoA). Leveraging a variational Bayesian approach, we obtain soft information about the AoA for each receiver. We then use the soft AoA information and geometrical constraints to localize the position of the user and further improve the channel estimation performance. Numerical results show that the proposed algorithm has centimeter-level localization accuracy for an outdoor scene. In addition, the proposed algorithm provides 1-3 dB of gain for channel estimation by exploiting the correlation among the receiver channels depending on the availability of prior information about the path loss model.