Abstract
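Mandarin is a tonal language, and the fundamental frequency (F0) contour is a very important parameter for unit selection in Chinese text-to-speech systems. Many current text-to-speech systems obtain rules of F0 contour variation by hand and apply those rules to guide unit selection. However, F0 contour variation is highly complex, so the rules that can be summarized manually are very limited; in addition, such rules can only be qualitative, which introduces uncertainty into unit selection. To select speech units more effectively on the basis of this acoustic parameter, a mapping between the textual context and the F0 contour, that is, an F0 model, must be established to describe F0 contour variation quantitatively. Taking a statistical approach, this paper builds the model with decision trees and applies it to a Mandarin text-to-speech system. Experimental results show that the F0 model built with the proposed method largely reflects the variation of the F0 contour and achieves fairly satisfactory results in speech synthesis.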
Mandarin is a tonal language with five tones. The pitch contour is a very important parameter for unit selection in Mandarin text-to-speech (TTS) systems. At present, many TTS systems rely on manually derived rules of pitch-contour variation to guide unit selection. However, because pitch-contour variation is highly complex, the number of rules that can be derived manually is very limited; moreover, such rules are only qualitative, which makes unit selection uncertain. To improve pitch-based unit selection, we need a mapping between the textual context and the pitch contour, called the pitch model, that describes pitch-contour variation quantitatively. This paper proposes a decision-tree method for building this model and applies the model in a Mandarin TTS system. Experimental results show that the resulting pitch model approximately reflects the variation of the pitch contour and works well in the Mandarin TTS system.
Source
《微电子学与计算机》
CSCD
Peking University Core Journal
2004, Issue 8, pp. 39-42 (4 pages)
Microelectronics & Computer
Keywords
Pitch contour
Text-to-speech
Clustering analysis
Decision tree
Unit selection