Minimum Generation Error Based Optimization of HMM Model Clustering for Speech Synthesis

Original Chinese title: 基于最小生成误差的HMM模型聚类自动优化


Abstract: To improve decision tree clustering and avoid over-training or under-training of the clustered models, a method is proposed that selects the minimal description length (MDL) factor by cross-validation (CV) under a minimum generation error criterion. The generation error computed during cross-validation is used to choose the MDL factor and thereby optimize the size of the decision tree. Subjective and objective tests show that, compared with the conventional baseline that uses a fixed MDL threshold, the proposed method produces synthesized speech with better quality and naturalness.
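The selection loop described in the abstract can be illustrated with a small sketch. This is a toy stand-in under stated assumptions, not the paper's implementation: a one-dimensional regression tree replaces the HMM context-clustering tree, squared error on held-out points replaces the generation error between generated and natural acoustic parameters, and the split penalty `factor * log(N)` is a simplified form of an MDL term. All function names, the data, and the candidate factors are hypothetical.

```python
import math
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(points, penalty):
    """Greedy binary splitting on x. A split is kept only if it lowers the
    squared error by more than `penalty` (toy stand-in for the MDL term)."""
    points = sorted(points)
    ys = [y for _, y in points]
    total = sse(ys)
    best_gain, best_i = 0.0, None
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot place a threshold between equal x values
        gain = total - sse(ys[:i]) - sse(ys[i:])
        if gain > best_gain:
            best_gain, best_i = gain, i
    if best_i is None or best_gain <= penalty:
        return ("leaf", sum(ys) / len(ys))
    threshold = (points[best_i - 1][0] + points[best_i][0]) / 2
    return ("node", threshold,
            build_tree(points[:best_i], penalty),
            build_tree(points[best_i:], penalty))

def predict(tree, x):
    """Walk down to a leaf and return its mean (the 'generated' value)."""
    while tree[0] == "node":
        tree = tree[2] if x < tree[1] else tree[3]
    return tree[1]

def count_leaves(tree):
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])

def cv_generation_error(data, factor, k=5):
    """Mean squared held-out error over k folds for one candidate factor."""
    folds = [data[i::k] for i in range(k)]
    total, n = 0.0, 0
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        penalty = factor * math.log(len(train))  # assumed penalty form
        tree = build_tree(train, penalty)
        total += sum((predict(tree, x) - y) ** 2 for x, y in folds[i])
        n += len(folds[i])
    return total / n

def select_mdl_factor(data, factors, k=5):
    """Return the factor with the lowest cross-validated error, plus all errors."""
    errors = {f: cv_generation_error(data, f, k) for f in factors}
    return min(errors, key=errors.get), errors

# Hypothetical data: a noisy step function standing in for acoustic parameters.
random.seed(0)
data = []
for _ in range(150):
    x = random.random()
    y = (0.0 if x < 0.3 else 2.0 if x < 0.7 else 1.0) + random.gauss(0, 0.3)
    data.append((x, y))

factors = [0.0, 0.5, 2.0, 8.0, 1e6]
best, errors = select_mdl_factor(data, factors)
print("selected factor:", best)
```

In this sketch a factor of 0 grows the tree until nearly every point has its own leaf (over-training), while a very large factor leaves a single leaf (under-training); cross-validation picks a factor between these extremes without a hand-set threshold.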

Authors: Lu Heng (卢恒), Ling Zhenhua (凌震华), Lei Ming (雷鸣), Dai Lirong (戴礼荣), Wang Renhua (王仁华)

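Affiliation: iFlytek Speech Laboratory, Department of Electronic Engineering and Information Science, University of Science and Technology of China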

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2010, No. 6, pp. 822-828 (7 pages). Indexed in EI, CSCD, and the Peking University core journal list.

Keywords: Hidden Markov Model (HMM); Speech Synthesis; Decision Tree Clustering; Minimal Description Length (MDL); Cross-Validation (CV)

Classification number: TN912.33 [Electronics and Telecommunications: Communication and Information Systems]


