A Smoothing Method for Voiced Units Concatenation Based on Time-Domain Unit Fusion

Original title: 基于时域单元融合的拼接平滑算法

Abstract: Corpus-based concatenative speech synthesis has become popular for its high-quality speech. However, the concatenated speech often suffers from discontinuities between acoustic units, caused by contextual differences and variations in speaking style across the database. Mismatches at the joins between voiced units are especially damaging to synthesis quality. This paper proposes a smoothing method called time-domain unit fusion (TD-UF) to smooth the discontinuities between voiced units. The method selects an appropriate transition template as the fusion unit by template (periodic) matching in the time domain and performs phase alignment at the same time. It then fuses the concatenated units and the fusion unit by pitch-synchronous overlap-add in the time domain, using TD-PSOLA. Its advantages are that it causes very little damage to voice quality and that it is efficient, because it operates directly in the time domain. Comparisons of spectrograms and of subjective listening impressions before and after smoothing show that the smoothed speech is clearly improved.
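The abstract names the two steps of the method (phase alignment of the fusion unit, then pitch-synchronous overlap-add) but gives no implementation details. The following is only a minimal sketch of those two ideas under simplifying assumptions, not the authors' implementation: the function names `align_lag` and `fuse` are hypothetical, the pitch period is assumed constant and known, alignment uses a plain dot-product search, and template selection is omitted entirely.

```python
import math


def hann(n):
    # Periodic Hann window: two copies shifted by n/2 samples sum to 1.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def align_lag(unit, template, max_lag):
    """Return the shift (0..max_lag samples) of `template` whose waveform
    best matches `unit`, scored by a plain dot product (an assumption)."""
    n = min(len(unit), len(template)) - max_lag
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        score = sum(unit[i] * template[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def fuse(unit, template, period):
    """Pitch-synchronous overlap-add of two phase-aligned signals.

    Frames are two periods long, Hann-windowed, and centred on marks spaced
    one (assumed constant) period apart. The mixing weight moves linearly
    from the concatenated unit (first frame) to the fusion unit (last frame).
    The first and last half-frames are covered by one window only and so
    stay tapered.
    """
    n = min(len(unit), len(template))
    win = hann(2 * period)
    marks = list(range(period, n - period + 1, period))
    out = [0.0] * n
    for k, m in enumerate(marks):
        w = k / (len(marks) - 1) if len(marks) > 1 else 1.0
        for j in range(2 * period):
            i = m - period + j
            out[i] += win[j] * ((1.0 - w) * unit[i] + w * template[i])
    return out
```

In a real system the pitch marks would come from the recorded units and the period would vary from frame to frame; a fixed period is used here only to keep the overlap-add arithmetic easy to follow.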

Authors: 郭武 (Guo Wu), 吴义坚 (Wu Yijian)

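Affiliation: iFlytek Speech Laboratory, Department of Electronic Engineering and Information Science, University of Science and Technology of China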

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 5, pp. 71-76 (6 pages)

Keywords: computer application; Chinese information processing; time-domain unit fusion; concatenated unit; fusion unit

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]
