This paper introduces a method to estimate the direction of arrival of an acoustic signal based on finding maximum power in iteratively reduced regions of a spherical surface. A plane wave decomposition beamformer is used to produce power estimates at sparsely distributed points on the sphere. Iterating beam orientation based on the orientation of maximum energy produces accurate localization results. The method is tested using varying reverberation times, source-receiver distances, and angular separation of multiple sources and compared against a pseudo-intensity vector estimator. Results demonstrate that this method is suitable for integration into real-time telematic frameworks, especially in reverberant conditions.
Sparse Iterative Beamforming Using Spherical Microphone Arrays for Low-Latency Direction of Arrival Estimation in Reverberant Environments
Acoustic direction of arrival estimation methods allow positional information about sound sources to be transmitted over a network using minimal bandwidth. For these purposes, methods that prioritize low computational overhead and consistent accuracy under non-ideal conditions are preferred. The estimation method introduced in this paper uses a set of steered beams to estimate directional energy at sparsely distributed orientations around a spherical microphone array. By iteratively adjusting beam orientations based on the orientation of maximum energy, an accurate orientation estimate of a sound source may be produced with minimal computational cost. Incorporating conditions based on temporal smoothing and diffuse energy estimation further refines this process. Testing under simulated conditions indicates favorable accuracy under reverberation and source discrimination when compared with several other contemporary localization methods. Outcomes include an average localization error of less than 10° under 2 s of reverberation time (T60) and the potential to separate up to four sound sources under the same conditions. Results from testing in a laboratory environment demonstrate potential for integration into real-time frameworks.
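The coarse-to-fine search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the beamformer is replaced by a hypothetical callable `power_fn(direction)`, the candidate layout (a golden-angle spiral), the shrink factor, and the names `cap_points` and `iterative_doa` are all assumptions made for illustration, and the temporal smoothing and diffuse energy conditions mentioned above are omitted.

```python
import numpy as np

GOLDEN_ANGLE = np.pi * (3.0 - np.sqrt(5.0))

def cap_points(center, half_angle, n):
    """Spread n unit vectors over a spherical cap around `center`.

    A golden-angle spiral is used as one convenient sparse layout;
    half_angle = pi covers the whole sphere.
    """
    i = np.arange(n) + 0.5
    z = 1.0 - (1.0 - np.cos(half_angle)) * i / n   # height along `center`
    r = np.sqrt(np.clip(1.0 - z ** 2, 0.0, None))  # radius off the axis
    phi = GOLDEN_ANGLE * i
    # Build an orthonormal basis (u, v, center) to rotate the cap into place.
    if abs(center[0]) < 0.9:
        helper = np.array([1.0, 0.0, 0.0])
    else:
        helper = np.array([0.0, 1.0, 0.0])
    u = np.cross(center, helper)
    u /= np.linalg.norm(u)
    v = np.cross(center, u)
    return ((r * np.cos(phi))[:, None] * u
            + (r * np.sin(phi))[:, None] * v
            + z[:, None] * center)

def iterative_doa(power_fn, n_points=16, n_iters=8, shrink=0.6):
    """Estimate a direction of arrival by iteratively shrinking the search cap.

    power_fn is a stand-in for a steered-beam power estimate (hypothetical).
    """
    half_angle = np.pi
    candidates = cap_points(np.array([0.0, 0.0, 1.0]), half_angle, n_points)
    best = candidates[0]
    for _ in range(n_iters):
        powers = np.array([power_fn(d) for d in candidates])
        best = candidates[int(np.argmax(powers))]
        half_angle *= shrink
        # Keep the current best so the estimate never gets worse.
        candidates = np.vstack([best, cap_points(best, half_angle, n_points)])
    return best

# Usage with a toy, single-lobed power pattern peaking at `source`.
source = np.array([0.3, -0.5, 0.8])
source /= np.linalg.norm(source)
estimate = iterative_doa(lambda d: (1.0 + d @ source) ** 4)
```

With these assumed defaults each pass evaluates at most 17 beam directions, which is the appeal of a sparse iterative search over scanning a dense grid. A real beam pattern has sidelobes and a finite main-lobe width, so the shrink factor and point count would need to be chosen against the array's actual resolution.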
- Award ID(s): 1909229
- PAR ID: 10324140
- Date Published:
- Journal Name: Journal of the Audio Engineering Society
- Volume: 69
- Issue: 12
- ISSN: 1549-4950
- Page Range / eLocation ID: 967-977
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Canlon Barbara (Ed.) The human auditory system can localize multiple sound sources using time, intensity, and frequency cues in the sound received by the two ears. Being able to spatially segregate the sources helps perception in a challenging condition when multiple sounds coexist. This study used model simulations to explore an algorithm for localizing multiple sources in azimuth with binaural (i.e., two) microphones. The algorithm relies on the “sparseness” property of daily signals in the time-frequency domain, and sound coming from different locations carrying unique spatial features will form clusters. Based on an interaural normalization procedure, the model generated spiral patterns for sound sources in the frontal hemifield. The model itself was created using broadband noise for better accuracy, because speech typically has sporadic energy at high frequencies. The model at an arbitrary frequency can be used to predict locations of speech and music that occurred alone or concurrently, and a classification algorithm was applied to measure the localization error. Under anechoic conditions, averaged errors in azimuth increased from 4.5° to 19° with RMS errors ranging from 6.4° to 26.7° as model frequency increased from 300 to 3000 Hz. The low-frequency model performance using short speech sound was notably better than the generalized cross-correlation model. Two types of room reverberations were then introduced to simulate difficult listening conditions. Model performance under reverberation was more resilient at low frequencies than at high frequencies. Overall, our study presented a spiral model for rapidly predicting horizontal locations of concurrent sound that is suitable for real-world scenarios.
- Keypoints used for image matching often include an estimate of the feature scale and orientation. While recent work has demonstrated the advantages of using feature scales and orientations for relative pose estimation, relatively little work has considered their use for absolute pose estimation. We introduce minimal solutions for absolute pose from two oriented feature correspondences in the general case, or one scaled and oriented correspondence given a known vertical direction. Nowadays, assuming a known direction is not particularly restrictive as modern consumer devices, such as smartphones or drones, are equipped with Inertial Measurement Units (IMU) that provide the gravity direction by default. Compared to traditional absolute pose methods requiring three point correspondences, our solvers need a smaller minimal sample, reducing the cost and complexity of robust estimation. Evaluations on large-scale and public real datasets demonstrate the advantage of our methods for fast and accurate localization in challenging conditions. Code is available at https://github.com/danini/absolute-pose-from-oriented-and-scaled-features.
- The vibrational response of an elastic panel to incident acoustic waves is determined by the direction-of-arrival (DOA) of the waves relative to the spatial structure of the panel's bending modes. By monitoring the relative modal excitations of a panel immersed in a sound field, the DOA of the source may be inferred. In reverberant environments, early acoustic reflections and the late diffuse acoustic field may obscure the DOA of incoming sound waves. Panel microphones may be especially susceptible to the effects of reverberation due to their large surface areas and long-decaying impulse responses. An investigation into the effect of reverberation on the accuracy of DOA estimation with panel microphones was made by recording wake-word utterances in eight spaces with reverberation times (RT60s) ranging from 0.27 to 3.00 s. The responses were used to train neural networks to estimate the DOA. Within ±5°, DOA estimation reliability was measured at 95.00% in the least reverberant space, decreasing to 78.33% in the most reverberant space, suggesting an inverse relationship between RT60 and DOA accuracy. Experimental results suggest that a system for estimating DOA with panel microphones can generalize to new acoustic environments by cross-training the system with data from multiple spaces with different RT60s.
- The problem of sound source localization has attracted the interest of researchers from different disciplines ranging from biology to robotics and navigation. In essence, it is an estimation problem: determining the location of the sound source from the information available to sound receivers. It is common practice to design Bayesian estimators based on a dynamic model of the system. Nevertheless, in some practical situations involving a moving sound source, such a dynamic model may not be available, while some a priori information about the source may be known. This paper considers a case study of designing an estimator using available a priori information, along with measurement signals received from a bearing-only sensor, to track a moving sound source in two dimensions.