Abstract
The Waveform Similarity Overlap-and-Add (WSOLA) algorithm ignores the perceptual characteristics of speech and applies uniform time scaling to the entire signal, so its output quality degrades when the sampling rate is low or the scaling ratio is large. To address this, an improved WSOLA time scaling algorithm is proposed, based on an analysis of the predictive characteristics of the human auditory system. The algorithm detects the turning points (transition regions) of the speech with a subband spectral entropy measure and leaves them unmodified, so that the speech information in those regions is not damaged, while time scaling the remainder of the signal. A local compensation method is also given to correct the overall scaling accuracy. Simulation results show that the algorithm improves the naturalness of the synthesized speech while keeping the overall scaling ratio unchanged.
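The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the equal-width subband split, the naive DFT, and the compensation formula are all assumptions made for illustration. The first function computes a subband spectral entropy for one frame (a flat spectrum gives high entropy, a single tone gives entropy near zero). The second shows one simple way to keep the overall ratio unchanged when some samples are left unscaled: if P of N samples are protected and the target overall ratio is a, the remaining samples are scaled by (a*N - P) / (N - P).

```python
import cmath
import math


def subband_spectral_entropy(frame, n_subbands=8):
    """Entropy (in nats) of the normalized subband energy of one frame.

    Hypothetical helper: equal-width subbands over the positive-frequency
    bins are an assumption, not a detail taken from the paper.
    """
    n = len(frame)
    half = n // 2
    # Naive DFT power spectrum over positive-frequency bins (DC skipped).
    power = []
    for k in range(1, half + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        power.append(abs(acc) ** 2)
    # Group the bins into equal-width subbands and sum the energy in each.
    width = len(power) // n_subbands
    energies = [sum(power[b * width:(b + 1) * width])
                for b in range(n_subbands)]
    total = sum(energies)
    if total == 0.0:
        return 0.0  # silent frame: define the entropy as zero
    probs = [e / total for e in energies]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def compensated_ratio(total_len, protected_len, overall_ratio):
    """Scaling ratio for the unprotected samples so that the whole signal
    still reaches overall_ratio when protected samples keep ratio 1.

    Assumed formula for illustration; the paper's local compensation
    method may differ.
    """
    free_len = total_len - protected_len
    if free_len <= 0:
        raise ValueError("no unprotected samples left to scale")
    ratio = (overall_ratio * total_len - protected_len) / free_len
    if ratio <= 0:
        raise ValueError("target ratio cannot be met with this much protected signal")
    return ratio
```

For example, with 1000 samples of which 200 are protected and a target overall ratio of 1.5, the remaining 800 samples would be scaled by 1.625, since 200 + 800 * 1.625 = 1500.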
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2015, Issue 10, pp. 260-264 (5 pages)
Funding
Natural Science Foundation of the Gansu Provincial Department of Science and Technology (Grant No. 1310RJZA050)
Keywords
time scaling algorithm
Waveform Similarity Overlap-and-Add (WSOLA) algorithm
auditory prediction
turning point detection
subband spectral entropy
local compensation method