Abstract
This paper describes the construction of CUCBNC, a broadcast news speech corpus built so that knowledge from broadcast announcing studies can be applied to speech engineering. By interpreting the relevant literature on broadcast announcing, new prosodic features are proposed, namely voice expression characteristics, discourse stress, meaning expression clusters, and compound prosodic phrases, and these are incorporated into the prosodic and text annotation scheme of the corpus. About 14 hours of broadcast speech, read by two female announcers, have been annotated so far. Finally, the statistical distribution of these features in the annotated data is examined to check whether the extended annotation scheme is appropriate. The results indicate that the proposed prosodic features are reasonable.
Source
Journal of Tsinghua University (Science and Technology)
EI
CAS
CSCD
PKU Core (北大核心)
2011, Issue 9, pp. 1313-1316 (4 pages)
Funding
Communication University of China, Project 211 program (21103010105)
Communication University of China, Research Cultivation Project (P201012)
Keywords
speech corpus
prosody annotation
broadcast announcing