Greedy Adaptive Speech Time Scale Modification Algorithm Based on Pronunciation Mechanism


Abstract: The Synchronized Overlap-Add (SOLA) algorithm for speech Time Scale Modification (TSM) applies the same scaling factor to every speech frame, ignoring the fact that in real speech different types of frames respond differently to changes in speaking rate. When the scaling ratio is large, the output speech becomes distorted. To address this problem, a greedy adaptive algorithm is proposed. It applies different scaling factors to different types of speech segments and adjusts those factors dynamically to correct the deviation in the overall scaling ratio. Comparison experiments on speech from the TIMIT corpus in a Matlab environment show that, compared with the Waveform Similarity Overlap-Add (WSOLA) and Time Domain Pitch Synchronous Overlap-Add (TD-PSOLA) algorithms, the proposed algorithm improves the naturalness of the synthesized speech and reduces the error in the scaled duration.
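The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: each segment type gets its own scaling factor, and the factors are re-solved greedily so that the total output duration still matches the requested overall ratio. The segment labels, the `ELASTICITY` weights, the clamp limits `F_MIN` and `F_MAX`, and the function name `greedy_factors` are all illustrative assumptions, not values or names taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical weights: how strongly each segment type follows the global factor.
# These numbers are illustrative assumptions, not values from the paper.
ELASTICITY = {"vowel": 1.0, "silence": 1.2, "consonant": 0.4}
F_MIN, F_MAX = 0.5, 2.0  # assumed limits on any single segment's factor


@dataclass
class Segment:
    duration: float  # input duration in seconds
    kind: str        # one of the keys of ELASTICITY


def greedy_factors(segments, alpha):
    """Assign one scaling factor per segment for an overall ratio `alpha`.

    At each step the common stretch amount x is re-solved over the segments
    that remain, so any shortfall caused by clamping an earlier segment is
    pushed onto the later ones (the greedy, adaptive part of this sketch).
    """
    target_remaining = alpha * sum(s.duration for s in segments)
    factors = []
    for i, seg in enumerate(segments):
        rest = segments[i:]
        plain = sum(s.duration for s in rest)
        weighted = sum(s.duration * ELASTICITY[s.kind] for s in rest)
        # Solve sum(d * (1 + w * x)) == target_remaining for x.
        x = (target_remaining - plain) / weighted if weighted else 0.0
        factor = 1.0 + ELASTICITY[seg.kind] * x
        factor = min(max(factor, F_MIN), F_MAX)
        factors.append(factor)
        target_remaining -= seg.duration * factor
    return factors


if __name__ == "__main__":
    demo = [
        Segment(0.08, "consonant"),
        Segment(0.20, "vowel"),
        Segment(0.10, "silence"),
        Segment(0.06, "consonant"),
        Segment(0.25, "vowel"),
    ]
    for seg, f in zip(demo, greedy_factors(demo, 1.5)):
        print(f"{seg.kind:9s} {seg.duration:.2f}s -> factor {f:.3f}")
```

In this sketch, consonant segments are stretched less than vowels or silence, while the summed output duration still equals `alpha` times the input duration as long as no factor reaches a clamp limit. A real system would then pass each factor to an overlap-add routine such as SOLA or WSOLA, which is outside the scope of this example.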

Authors: Yang Yan (杨燕), Lei Yingsi (雷颖思), Yue Hui (岳辉)

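Affiliations: School of Electronic and Information Engineering, Lanzhou Jiaotong University; School of Railway Technology, Lanzhou Jiaotong University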

Source: Computer Engineering (《计算机工程》), 2015, No. 8, pp. 212-217 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.

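Funding: Natural Science Foundation of the Gansu Provincial Department of Science and Technology (1310RJZA050); Special Fund for Basic Scientific Research of Higher Education Institutions of Gansu Province (214138)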

Keywords: speech Time Scale Modification (TSM); modification factor; Synchronized Overlap-Add (SOLA) algorithm; adaptive algorithm; greedy adaptive algorithm

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

