
Minimum Generation Error Based Optimization of HMM Model Clustering for Speech Synthesis

Cited by: 1
Abstract: To improve decision tree clustering and avoid over-training or under-training of the clustered models, a method is proposed that is based on the minimum generation error criterion and uses cross-validation (CV) to optimize the choice of the minimum description length (MDL) factor. The MDL factor is selected by computing the generation error under cross-validation, which in turn optimizes the size of the decision tree. Results of both subjective and objective tests show that, compared with the conventional approach of setting a fixed MDL threshold, the proposed method yields synthesized speech with better quality and naturalness.
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the PKU Core Journals list, 2010, No. 6, pp. 822-828 (7 pages)
Keywords: Hidden Markov Model (HMM); Speech Synthesis; Decision Tree Clustering; Minimum Description Length (MDL); Cross-Validation (CV)
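
The selection procedure summarized in the abstract can be illustrated with a small sketch. This is a toy stand-in under stated assumptions, not the authors' implementation: a one-dimensional regression tree replaces the HMM state-clustering tree, the leaf mean replaces the generated speech parameters, squared error on held-out data replaces the generation error, and the penalty `alpha * log(N)` is an assumed MDL-style stopping rule. All function names (`build_tree`, `cv_generation_error`, `select_mdl_factor`, `make_toy_data`) are hypothetical.

```python
import math
import random
from statistics import fmean

VAR_FLOOR = 1e-6  # keeps log() finite when a leaf is nearly constant


def gauss_cost(ys):
    """N * log(ML variance): -2 x Gaussian log-likelihood, up to a constant."""
    mean = fmean(ys)
    var = max(fmean((y - mean) ** 2 for y in ys), VAR_FLOOR)
    return len(ys) * math.log(var)


def build_tree(data, alpha, n_total, min_leaf=2):
    """Greedy tree over (x, y) pairs sorted by x, with an MDL-style stop rule."""
    ys = [y for _, y in data]
    parent_cost = gauss_cost(ys)
    best_gain, best_i = 0.0, None
    for i in range(min_leaf, len(data) - min_leaf + 1):
        if data[i - 1][0] == data[i][0]:
            continue  # cannot separate identical contexts
        gain = 0.5 * (parent_cost - gauss_cost(ys[:i]) - gauss_cost(ys[i:]))
        if gain > best_gain:
            best_gain, best_i = gain, i
    # Assumed penalty for one extra leaf (mean + variance), scaled by alpha.
    penalty = alpha * math.log(n_total)
    if best_i is None or best_gain <= penalty:
        return ("leaf", fmean(ys))
    threshold = 0.5 * (data[best_i - 1][0] + data[best_i][0])
    left = build_tree(data[:best_i], alpha, n_total, min_leaf)
    right = build_tree(data[best_i:], alpha, n_total, min_leaf)
    return ("node", threshold, left, right)


def predict(tree, x):
    """Return the leaf mean for context x (stand-in for parameter generation)."""
    while tree[0] == "node":
        tree = tree[2] if x < tree[1] else tree[3]
    return tree[1]


def count_leaves(tree):
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])


def cv_generation_error(data, alpha, k=5):
    """Mean held-out squared error over k folds for one MDL factor."""
    fold_errors = []
    for f in range(k):
        held_out = data[f::k]
        train = sorted(d for i, d in enumerate(data) if i % k != f)
        tree = build_tree(train, alpha, len(train))
        fold_errors.append(fmean((predict(tree, x) - y) ** 2 for x, y in held_out))
    return fmean(fold_errors)


def select_mdl_factor(data, candidates, k=5):
    """Pick the candidate factor with the lowest cross-validated error."""
    errors = {a: cv_generation_error(data, a, k) for a in candidates}
    return min(errors, key=errors.get), errors


def make_toy_data(n=120, seed=0):
    """Synthetic data: three context regimes plus Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        level = 0.0 if x < 0.3 else (2.0 if x < 0.7 else -1.0)
        data.append((x, level + rng.gauss(0.0, 0.5)))
    return data


if __name__ == "__main__":
    best, errors = select_mdl_factor(make_toy_data(), [0.0, 0.5, 1.0, 2.0, 4.0, 1000.0])
    for a, e in errors.items():
        print(f"alpha={a:g}  CV generation error={e:.4f}")
    print("selected MDL factor:", best)
```

In this sketch a factor of 0 accepts every split with positive likelihood gain (the over-trained extreme) and a very large factor leaves a single leaf (the under-trained extreme); the cross-validated error is what arbitrates between them, in place of a fixed threshold.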

References (11)

  • 1. Hunt A J, Black A W. Unit Selection in a Concatenative Speech Synthesis System Using a Large Speech Database // Proc of the IEEE International Conference on Acoustics, Speech and Signal Processing. Atlanta, USA, 1996: 373-376.
  • 2. Tokuda K, Yoshimura T, Masuko T, et al. Speech Parameter Generation Algorithms for HMM-Based Speech Synthesis // Proc of the IEEE International Conference on Acoustics, Speech and Signal Processing. Istanbul, Turkey, 2000, Ⅲ: 1315-1318.
  • 3. Yoshimura T, Tokuda K, Masuko T, et al. Simultaneous Modeling of Spectrum, Pitch and Duration in HMM-Based Speech Synthesis // Proc of the 6th European Conference on Speech Communication and Technology. Budapest, Hungary, 1999, Ⅴ: 2347-2350.
  • 4. Tokuda K, Masuko T, Miyazaki N, et al. Hidden Markov Models Based on Multi-Space Probability Distribution for Pitch Pattern Modeling // Proc of the IEEE International Conference on Acoustics, Speech and Signal Processing. Phoenix, USA, 1999: 229-232.
  • 5. Shinoda K, Watanabe T. MDL-Based Context-Dependent Subword Modeling for Speech Recognition. Acoustical Science and Technology, 2000, 21(2): 79-86.
  • 6. Wu Yijian, Wang Renhua. HMM-Based Trainable Chinese Speech Synthesis (in Chinese). Journal of Chinese Information Processing, 2006, 20(4): 75-81.
  • 7. Wu Yijian, Wang Renhua. Minimum Generation Error Training for HMM-Based Speech Synthesis // Proc of the IEEE International Conference on Acoustics, Speech and Signal Processing. Toulouse, France, 2006: 89-92.
  • 8. Kawahara H, Masuda-Katsuse I, de Cheveigné A. Restructuring Speech Representations Using a Pitch-Adaptive Time-Frequency Smoothing and an Instantaneous-Frequency-Based F0 Extraction: Possible Role of a Repetitive Structure in Sounds. Speech Communication, 1999, 27(3/4): 187-207.
  • 9. Tokuda K, Zen H, Yamagishi J, et al. The HMM-Based Speech Synthesis System (HTS) [EB/OL]. [2009-06-01]. http://hts.sp.nitech.ac.jp/.
  • 10. Laroia R, Phamdo N, Farvardin N. Robust and Efficient Quantization of Speech LSP Parameters Using Structured Vector Quantizers // Proc of the International Conference on Acoustics, Speech and Signal Processing. Toronto, Canada, 1991: 641-644.

