Search Results: 3 records
-
The Interspeech 2025 Speech Emotion Recognition in Naturalistic Conditions Challenge builds on previous efforts to advance speech emotion recognition (SER) in real-world scenarios. The focus is on recognizing emotions from spontaneous speech, moving beyond controlled datasets. The challenge provides a framework for speaker-independent training, development, and evaluation, with annotations for both categorical and dimensional tasks. It attracted 93 research teams, whose models significantly improved on competitive state-of-the-art baselines. This paper summarizes the challenge, focusing on key outcomes: we analyze top-performing methods, emerging trends, and innovative directions, and highlight the effectiveness of combining audio- and text-based foundation models to build robust SER systems. The competition website, with leaderboards, baseline code, and instructions, is available at https://lab-msp.com/MSP-Podcast_Competition/IS2025/. Free, publicly accessible full text available August 17, 2026.
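The audio-plus-text combination highlighted above is often realized as late fusion: embeddings from an audio encoder and a text encoder are concatenated and fed to a classification head. The sketch below illustrates only that fusion step; the dimensions, the linear head, and the random stand-in features are assumptions for illustration, not any team's actual system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions only -- real systems use pretrained encoders
# (e.g., a speech foundation model for audio and a language model for
# the ASR transcript); random vectors stand in for their embeddings.
AUDIO_DIM, TEXT_DIM, N_EMOTIONS = 1024, 768, 8

def fuse(audio_emb: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """Late fusion: concatenate the two modality embeddings."""
    return np.concatenate([audio_emb, text_emb])

def classify(fused: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Linear head + softmax over categorical emotion classes."""
    logits = W @ fused + b
    exp = np.exp(logits - logits.max())  # shift for numerical stability
    return exp / exp.sum()

audio_emb = rng.standard_normal(AUDIO_DIM)  # stand-in pooled audio embedding
text_emb = rng.standard_normal(TEXT_DIM)    # stand-in transcript embedding
W = rng.standard_normal((N_EMOTIONS, AUDIO_DIM + TEXT_DIM)) * 0.01
b = np.zeros(N_EMOTIONS)

probs = classify(fuse(audio_emb, text_emb), W, b)
```

In practice the head is trained end-to-end and the fusion can be more elaborate (cross-attention, gating), but concatenation is the simplest baseline for combining the two modalities.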
-
Jahan, Maliha; Moro-Velazquez, Laureano; Thebaud, Thomas; Dehak, Najim; Villalba, Jesús (2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)). Ensuring that technological advancements benefit all groups of people equally is crucial, and the first step toward fairness is identifying existing inequalities. Naive comparison of group error rates may lead to wrong conclusions. We introduce a new method to determine whether a speaker verification system is fair toward several population subgroups. We propose to model miss and false-alarm probabilities as a function of multiple factors, including population-group effects (e.g., male and female) and a series of confounding variables (e.g., speaker effects, language, nationality). This model can estimate error rates related to a group effect without the influence of confounding effects. We experiment with a synthetic dataset in which we control group and confounding effects; our metric achieves significantly lower false-positive and false-negative rates than the baseline. We also experiment with the VoxCeleb and NIST SRE21 datasets on different ASV systems and present our conclusions.
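The core idea of the abstract above, separating a group effect from confounders rather than comparing raw group error rates, can be sketched with synthetic data. In the toy setup below (an assumption for illustration, not the paper's actual model or data), the group has no true effect on the miss rate, but a confounder correlated with group membership makes the naive per-group rates look unfair; a logistic model with both covariates recovers a near-zero group effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Synthetic trials: group has NO true effect on the miss probability,
# but a binary confounder (think: language) is correlated with group.
group = rng.integers(0, 2, n).astype(float)
conf = (rng.random(n) < 0.2 + 0.6 * group).astype(float)
logit = -2.0 + 0.0 * group + 1.5 * conf        # true model: only the confounder matters
miss = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

# Naive comparison: raw per-group miss rates differ and look unfair.
naive = [miss[group == g].mean() for g in (0.0, 1.0)]

# Logistic regression with group + confounder covariates, fitted by
# plain gradient descent (a minimal stand-in for the paper's modelling
# of error probabilities with group and confounding effects).
X = np.column_stack([np.ones(n), group, conf])
w = np.zeros(3)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - miss) / n

# w[1] is the group effect with the confounder held fixed: near zero,
# even though the naive rates suggested a large group disparity.
```

The point of the sketch is the contrast between `naive`, which conflates group and confounder, and `w[1]`, which isolates the group effect once the confounder is in the model.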
-
Hussein, Amir; Xiao, Cihan; Verma, Neha; Thebaud, Thomas; Wiesner, Matthew; Khudanpur, Sanjeev (Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023))
