
Chinese speech segmentation method based on Gauss distribution of time spans of syllables (cited by: 5)
Abstract: Research on syllable segmentation of natural Chinese speech has clear practical value: a reasonably accurate segmentation method can replace manual labeling of speech for which a reference text is available. So far, however, no fully accurate syllable segmentation method for Chinese speech exists. Based on two hypotheses, that the time spans of Chinese syllables produced under the same pronunciation conditions follow a Gaussian distribution, and that a short-time energy valley exists between two adjacent syllables, a Chinese syllable segmentation method based on Gaussian fitting of syllable time spans is proposed. The algorithm is then analyzed, and a simplified version is derived from the property that the short-time energy valleys found by an initial segmentation are distributed across the individual speech sub-segments; this effectively reduces the time complexity of the method. Experimental results show that the segmentation accuracy (the mean of the squared time differences between the manual labels and the labels produced by the method) is on the order of 10^-3, and that the computing time never exceeds 1 s in Matlab on a desktop PC, which meets application requirements.
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2016, No. 5, pp. 1410-1414, 1420 (6 pages)
Keywords: Chinese; natural speech; speech segmentation; time span; valley; Gauss distribution
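
The abstract states the two hypotheses but not the algorithm itself, so the following is only a minimal Python sketch of one way they could be combined; it is not the paper's method or its simplified variant. It assumes the number of syllables is known from the reference text, that a single mean `mu` and standard deviation `sigma` of syllable duration are given, and that boundaries are chosen among energy valleys by dynamic programming. All function and parameter names are hypothetical, and the units of the error measure are an assumption, since the abstract does not state them.

```python
import math

def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each frame of the signal."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, hop)]

def energy_valleys(energy):
    """Frame indices that are below the previous frame and not above the next."""
    return [i for i in range(1, len(energy) - 1)
            if energy[i] < energy[i - 1] and energy[i] <= energy[i + 1]]

def gauss_log_pdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def segment(valleys, total_len, n_syllables, mu, sigma):
    """Choose n_syllables - 1 valleys as boundaries so that the resulting
    durations have maximal total Gaussian log-likelihood (assumed criterion)."""
    points = [0] + sorted(valleys) + [total_len]
    m = len(points)
    neg = float("-inf")
    # best[k][j]: best score for splitting [0, points[j]] into k syllables
    best = [[neg] * m for _ in range(n_syllables + 1)]
    back = [[-1] * m for _ in range(n_syllables + 1)]
    best[0][0] = 0.0
    for k in range(1, n_syllables + 1):
        for j in range(1, m):
            for i in range(j):
                if best[k - 1][i] == neg:
                    continue
                s = best[k - 1][i] + gauss_log_pdf(points[j] - points[i], mu, sigma)
                if s > best[k][j]:
                    best[k][j], back[k][j] = s, i
    if best[n_syllables][m - 1] == neg:
        raise ValueError("not enough candidate valleys for this syllable count")
    starts, j = [], m - 1
    for k in range(n_syllables, 0, -1):
        j = back[k][j]          # start point of syllable k
        starts.append(points[j])
    return sorted(starts)[1:]   # drop the utterance start at 0

def boundary_mse(auto, manual):
    """Mean squared difference between automatic and manual boundary times."""
    return sum((a - b) ** 2 for a, b in zip(auto, manual)) / len(manual)
```

For example, with candidate valleys at frames 20, 48, 55, 101 and 130 in a 150-frame utterance of three syllables, and an assumed duration model of mean 50 and standard deviation 10 frames, `segment` keeps the valleys at 48 and 101, since they give durations closest to the mean. The triple loop is cubic in the worst case, which is the kind of cost a simplified algorithm such as the one the abstract mentions would aim to reduce.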

