SDT-HMM: Hidden Markov based soft decision tree with maximum entropy for maximum likelihood Tibet speech synthesis
Abstract: Traditional Tibetan speech synthesis systems based on hard decision trees generalize poorly. To address this, an improved binary soft decision tree algorithm is designed to estimate the parameters of a Tibetan speech synthesis model from contextual factors. Each internal node selects its child nodes according to their membership degrees, so every node can be regarded as a fuzzy set with a context-dependent membership function. Each context is thereby assigned to several overlapping leaf nodes, which improves the model's generalization and function-approximation performance. A maximum-entropy smooth distribution captures local first-order moments and global second-order moments, yielding maximum likelihood estimates of the soft-decision parameters of the hidden Markov model (HMM) output probability distributions. Simulation results show that the proposed algorithm meets real-time application requirements while effectively improving Tibetan speech synthesis quality.
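The soft assignment of one context to several overlapping leaves can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the membership function, so the sigmoid gate, the linear gate parameters `w` and `b`, the tree `TREE`, and the per-leaf means `LEAF_MEANS` are all hypothetical choices made for illustration. The sketch only shows that memberships multiply along each root-to-leaf path, sum to one over the leaves, and can weight per-leaf parameters.

```python
import math


def sigmoid(z):
    """Logistic function, used here as an assumed membership gate."""
    return 1.0 / (1.0 + math.exp(-z))


def leaf_memberships(node, x, weight=1.0, out=None):
    """Return {leaf_name: membership} for a context feature vector x.

    Each internal node splits its incoming membership between its two
    children, so every leaf receives a non-zero share (a soft decision)
    instead of exactly one leaf receiving everything (a hard decision).
    """
    if out is None:
        out = {}
    if "leaf" in node:
        out[node["leaf"]] = out.get(node["leaf"], 0.0) + weight
        return out
    z = sum(w * xi for w, xi in zip(node["w"], x)) + node["b"]
    g = sigmoid(z)  # membership degree of the right child (assumed form)
    leaf_memberships(node["left"], x, weight * (1.0 - g), out)
    leaf_memberships(node["right"], x, weight * g, out)
    return out


def soft_mean(tree, leaf_means, x):
    """Membership-weighted combination of per-leaf mean parameters."""
    m = leaf_memberships(tree, x)
    return sum(m[name] * leaf_means[name] for name in m)


# Hypothetical two-level binary tree over a 2-dimensional context vector.
TREE = {
    "w": [2.0, -1.0], "b": 0.0,
    "left": {"leaf": "A"},
    "right": {
        "w": [0.0, 1.0], "b": -0.5,
        "left": {"leaf": "B"},
        "right": {"leaf": "C"},
    },
}

# Hypothetical scalar output means, one per leaf.
LEAF_MEANS = {"A": 1.0, "B": 2.0, "C": 4.0}

if __name__ == "__main__":
    context = [1.0, 0.0]
    print(leaf_memberships(TREE, context))
    print(soft_mean(TREE, LEAF_MEANS, context))
```

In a hard decision tree the gate would be a 0/1 step, and the same code would then place all membership on a single leaf.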
Source: Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2017, No. 4, pp. 981-988 (8 pages)
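Funding: Ministry of Education Humanities and Social Sciences Research Youth Fund Project (15YJC740063); Ministry of Education Humanities and Social Sciences Research Tibet Fund Project (15XZJCZH001); Tibet University Youth Research Cultivation Fund Project (ZDPJZK1505); "Outstanding Young Scholar" funding under the main program of the Tibet University Everest Scholars Talent Development Support Plan; sub-project "Construction of a Spatio-temporal Database of Tibetan Dialects" of the National Social Science Major Project "Construction of a Tibetan Dialect Database Based on a Geographic Information Platform" (14ZDB101)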
Keywords: soft decision tree; Tibet speech synthesis; hidden Markov; maximum entropy; membership
