Abstract
In English speech synthesis systems, the accuracy of prosodic phrase boundary prediction critically affects the naturalness and intelligibility of the synthesized speech. Decision-tree-based prediction is currently the most widely used approach, but it has two weaknesses. First, because tree construction is constrained by data balance, it is difficult to build models for specific keywords. Second, because prediction relies on a locally optimal search, it cannot reach the global optimum. To improve prediction performance, this paper refines the decision-tree-based method: it introduces the conditional probability of prosodic phrases and uses the Viterbi algorithm to optimize the prosodic phrase boundary probability and the conditional probability simultaneously. It also proposes a method for optimizing the probabilities at decision tree nodes, based on the positional distribution of keywords within prosodic phrases. Experiments show that applying the improved method to the baseline system raises the F-score of phrase boundary prediction from 68.7% to 77.8% and lowers the unacceptable rate from 22.4% to 15.2%.
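The contrast between a local search and a joint Viterbi optimization can be illustrated with a minimal sketch. It does not reproduce the paper's model: the abstract does not specify the form of the conditional probability, so the sketch assumes, purely for illustration, a probability of phrase length (`p_length`) combined with per-juncture boundary probabilities such as a decision tree might output (`p_boundary`). All names and numbers here are hypothetical.

```python
import math

def greedy_boundaries(p_boundary, threshold=0.5):
    """Local decision: place a boundary after word i whenever the
    per-juncture probability exceeds the threshold. The juncture after
    the last word is always a boundary (sentence end)."""
    n = len(p_boundary)
    return [i for i in range(n) if i == n - 1 or p_boundary[i] > threshold]

def viterbi_boundaries(p_boundary, p_length, floor=1e-6):
    """Viterbi-style dynamic program over phrase segmentations.

    p_boundary[i]: assumed probability of a boundary after word i.
    p_length[k]:   assumed probability of a phrase with k words
                   (a stand-in for the paper's conditional probability).
    Returns the word indices after which a boundary is placed.
    """
    n = len(p_boundary)

    def log(x):
        # Floor probabilities to avoid log(0).
        return math.log(max(x, floor))

    best = [-math.inf] * (n + 1)  # best[e]: best log score for words 0..e-1
    back = [0] * (n + 1)          # back[e]: start of the last phrase ending at e
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            score = best[start] + log(p_length.get(end - start, 0.0))
            # Junctures inside the phrase carry no boundary.
            score += sum(log(1.0 - p_boundary[k]) for k in range(start, end - 1))
            # The juncture closing the phrase is a boundary; the final one
            # is forced by the sentence end, so it is not scored.
            if end < n:
                score += log(p_boundary[end - 1])
            if score > best[end]:
                best[end], back[end] = score, start

    # Trace back from the end of the sentence.
    boundaries = []
    end = n
    while end > 0:
        boundaries.append(end - 1)
        end = back[end]
    return boundaries[::-1]

# Hypothetical six-word sentence: two adjacent junctures look like boundaries.
p_boundary = [0.1, 0.55, 0.6, 0.1, 0.1, 1.0]
# Hypothetical phrase-length distribution that penalizes one-word phrases.
p_length = {1: 0.02, 2: 0.28, 3: 0.4, 4: 0.2, 5: 0.07, 6: 0.03}

print(greedy_boundaries(p_boundary))             # [1, 2, 5]
print(viterbi_boundaries(p_boundary, p_length))  # [2, 5]
```

In this toy input, thresholding each juncture independently yields a one-word phrase between two adjacent boundaries, whereas the joint search keeps only the stronger boundary because the assumed length model makes one-word phrases unlikely.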
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal (北大核心)
2012, Issue 8, pp. 2921-2925 (5 pages)
Keywords
speech synthesis
prosodic phrase
boundary prediction
decision tree
location distribution