Abstract
Traditional hard-decision-tree Tibetan speech synthesis systems generalize poorly. To address this, an improved binary soft decision tree algorithm is designed to estimate the parameters of a Tibetan speech synthesis model from contextual factors. Each internal node selects its child nodes according to a degree of membership, so every node can be viewed as a fuzzy set with context-dependent membership, and each context is assigned to several overlapping leaf nodes, which improves the model's generalization and function-approximation performance. A maximum-entropy smooth distribution is used to capture local first-order moment and global second-order moment features, yielding maximum likelihood estimation of the soft-decision parameters of the hidden Markov model (HMM) output probability distributions. Simulation results show that the proposed algorithm meets the real-time requirements of the application while effectively improving the quality of the synthesized Tibetan speech.
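The idea of assigning one context to several overlapping leaves can be illustrated with a minimal sketch. It is not the paper's implementation: the sigmoid gating function, the complete-tree heap layout, the numeric gate parameters, and the names `leaf_memberships` and `soft_mean` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leaf_memberships(context, gates):
    """Membership of `context` in every leaf of a complete binary soft tree.

    `gates` holds one (weights, bias) pair per internal node in heap order,
    so node i has children 2*i+1 and 2*i+2. The sigmoid gate is an assumed
    form of the membership function, not taken from the paper.
    """
    n_internal = len(gates)
    membership = [0.0] * (2 * n_internal + 1)
    membership[0] = 1.0  # the root contains every context fully
    for i, (weights, bias) in enumerate(gates):
        g = sigmoid(sum(w * x for w, x in zip(weights, context)) + bias)
        membership[2 * i + 1] = membership[i] * g          # left child
        membership[2 * i + 2] = membership[i] * (1.0 - g)  # right child
    return membership[n_internal:]  # the leaves follow the internal nodes

def soft_mean(context, gates, leaf_means):
    """Membership-weighted blend of per-leaf mean parameters."""
    m = leaf_memberships(context, gates)
    return sum(mi * mu for mi, mu in zip(m, leaf_means))

# Hypothetical depth-2 tree: 3 internal nodes, 4 leaves, 2 context features.
gates = [([1.0, -0.5], 0.0), ([0.3, 0.8], -0.2), ([-0.6, 0.4], 0.1)]
context = [1.0, 0.0]
print(leaf_memberships(context, gates))
print(soft_mean(context, gates, [1.0, 2.0, 3.0, 4.0]))
```

Unlike a hard tree, where exactly one leaf would receive membership 1, every leaf here receives a nonzero share and the shares sum to 1, so the predicted parameter varies smoothly with the context.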
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2017, No. 4, pp. 981-988 (8 pages)
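Funding
Ministry of Education Humanities and Social Sciences Research Youth Fund Project (15YJC740063)
Ministry of Education Humanities and Social Sciences Research Tibet Fund Project (15XZJCZH001)
Tibet University Young Researcher Cultivation Fund Project (ZDPJZK1505)
Funding from the "Outstanding Young Scholar" main program of the Tibet University Everest Scholars Talent Development Support Program
Sub-project "Construction of a Tibetan Dialect Spatio-Temporal Database" of the National Social Science Research Major Project "Construction of a Tibetan Dialect Database Based on a Geographic Information Platform" (14ZDB101)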
Keywords
soft decision tree
Tibetan speech synthesis
hidden Markov model
maximum entropy
degree of membership