
基于声调核参数及DNN建模的韵律边界检测研究 (Cited by: 1)

Automatic Mandarin Prosody Boundary Detection Based on Tone Nucleus and DNN Model
Abstract: Prosodic boundaries play an important role in the naturalness and intelligibility of speech, and prosody modeling is an important aspect of speech synthesis and speech understanding. Starting from the interaction between adjacent tones, this paper proposes a Mandarin prosodic boundary detection method based on a deep neural network (DNN) and the acoustic features of the tone nucleus. The method first computes the boundary-detection parameters from the acoustic features of the tone nucleus portion of each syllable, and then models these parameters with a DNN. For comparison, the experiments use a baseline system whose input features are the acoustic features of the whole syllable. The results show that the system using only tone-nucleus acoustic features outperforms the whole-syllable system, with a relative improvement of 4% in prosodic boundary detection accuracy, which indicates that the proposed method is effective.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 6, pp. 35-39, 48 (6 pages)
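Funding: Wutong Innovation Platform Project of Beijing Language and Culture University (Fundamental Research Funds for the Central Universities) (16PT05); Graduate Innovation Fund of Beijing Language and Culture University (Fundamental Research Funds for the Central Universities) (16YCX163)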
Keywords: prosody boundary modeling; tone nucleus; deep neural network