Prosody Prediction from Syntactic, Lexical, and Word Embedding Features

Sloan, Rose; Akhtar, Syed Sarfaraz; Li, Bryan; Shrivastava, Ritvik; Gravano, Agustin; Hirschberg, Julia

doi:10.21437/SSW.2019-48

Accurate prosody prediction from text leads to more natural-sounding TTS. In this work, we employ a new set of features to predict ToBI pitch accent and phrase boundaries from text. We investigate a wide variety of text-based features, including many new syntactic features, several types of word embeddings, co-reference features, LIWC features, and specificity information. We focus our work on the Boston Radio News Corpus, a ToBI-labeled corpus of relatively clean news broadcasts, but also test our classifiers on Audix, a smaller corpus of read news, and on the Columbia Games Corpus, a corpus of conversational speech, in order to test the applicability of our model in cross-corpus settings. Our results show strong performance on both tasks, as well as some promising results for cross-corpus applications of our models.

Award ID(s): 1717680

PAR ID: 10177402

Author(s) / Creator(s): Sloan, Rose; Akhtar, Syed Sarfaraz; Li, Bryan; Shrivastava, Ritvik; Gravano, Agustin; Hirschberg, Julia

Date Published: 2019-09-20

Journal Name: 10th ISCA Speech Synthesis Workshop

Page Range / eLocation ID: 269 to 274

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.21437/SSW.2019-48
