Abstract
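The fundamental frequency contour is the curve of fundamental frequency over time; it captures the trends of tone and intonation, the most important prosodic features of Chinese. The Fujisaki model, a prosodic model developed by Professor Hiroya Fujisaki of the University of Tokyo, Japan, can approximate the fundamental frequency contour of Chinese speech very well (1). This paper proposes a method, based on that model, for extracting parameters from the fundamental frequency contour of a Chinese monosyllable. We first use a wavelet-transform-based pitch detection technique (3) to obtain accurate fundamental frequency values for the given character and connect them to form the contour. We then fit the Fujisaki model to these points under the minimum mean squared error criterion, and finally take the resulting optimal model parameters as the parameters of the contour.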
The fundamental frequency contour (f0 contour) is the curve of fundamental frequency over time; it manifests tone, the most important prosodic feature of spoken Chinese. Fujisaki's model is a prosodic model developed by H. Fujisaki of the Science University of Tokyo, Japan, with which the f0 contour of Chinese speech can be well approximated. In this paper we present a method based on this model for extracting parameters from the f0 contour of a given Chinese syllable. We first obtain the f0 contour of the syllable using a pitch extraction method based on the wavelet transform. The model parameters that minimize the mean squared error between the extracted f0 contour and the model-generated contour are then taken as the parameters of that f0 contour. The validity of the proposed method has been confirmed experimentally.
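The fitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the commonly cited form of the Fujisaki model (one phrase command plus tone commands in the log-f0 domain), measures the error on ln f0, and replaces the unspecified optimizer with a simple grid search over one tone-command amplitude. The constants alpha, beta and gamma and all function names are illustrative assumptions.

```python
import math

# Illustrative constants (assumptions, not values taken from the paper).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism
BETA = 20.0   # natural angular frequency of the tone (accent) control mechanism
GAMMA = 0.9   # ceiling of the tone component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase mechanism: alpha^2 * t * exp(-alpha * t) for t >= 0."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the tone mechanism, clipped at gamma, for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_lnf0(t, fb, ap, t0, tone_cmds):
    """ln f0(t) for one phrase command (ap, t0) and tone commands (aa, t1, t2)."""
    value = math.log(fb) + ap * phrase_response(t - t0)
    for aa, t1, t2 in tone_cmds:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def mse(times, f0_values, fb, ap, t0, tone_cmds):
    """Mean squared error between measured and modelled ln f0 (log domain is an assumption)."""
    errs = [(math.log(f) - fujisaki_lnf0(t, fb, ap, t0, tone_cmds)) ** 2
            for t, f in zip(times, f0_values)]
    return sum(errs) / len(errs)


def fit_tone_amplitude(times, f0_values, fb, ap, t0, t1, t2, candidates):
    """Grid search: return the candidate tone amplitude with the smallest MSE."""
    return min(candidates,
               key=lambda aa: mse(times, f0_values, fb, ap, t0, [(aa, t1, t2)]))


# Synthetic f0 points for a 0.4 s syllable, generated from known parameters.
times = [i * 0.01 for i in range(40)]
f0_values = [math.exp(fujisaki_lnf0(t, 120.0, 0.4, -0.1, [(0.3, 0.05, 0.30)]))
             for t in times]
candidates = [i / 10 for i in range(-5, 6)]  # negative amplitudes are allowed
best = fit_tone_amplitude(times, f0_values, 120.0, 0.4, -0.1, 0.05, 0.30, candidates)
```

In this sketch only one amplitude is searched while the other parameters are held fixed; a full fit would search the baseline frequency, the command timings and all amplitudes jointly, for example with a general-purpose nonlinear least-squares routine.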
Source
《小型微型计算机系统》
CSCD
PKU Core Journal
1999, No. 10, pp. 756-759 (4 pages)
Journal of Chinese Computer Systems
Funding
National Natural Science Foundation of China
Keywords
Fujisaki model
Fundamental frequency contour
Chinese speech
Parameter extraction
Speech synthesis
Fujisaki's model
f0 contour
Wavelet transform
Minimum mean squared error