Abstract
To address the low accuracy and long detection time of speech endpoint detection in complex environments, a speech endpoint detection algorithm based on joint denoising with ensemble empirical mode decomposition (EEMD) and one-stage dictionary learning (OS-DL) is proposed. The input speech is first decomposed by EEMD into intrinsic mode functions (IMFs). The OS-DL algorithm is then trained separately on clean speech and on noise to obtain a magnitude-spectrum dictionary for each. The magnitude spectrum is sparsely represented over these dictionaries, the resulting coefficient matrix is used to reconstruct the speech spectrum, and an inverse Fourier transform of the reconstructed spectrum yields the denoised speech signal. Finally, endpoints are detected in the denoised signal with the uniform sub-band frequency-band variance method. Experimental results show that, in complex environments with a signal-to-noise ratio below -10 dB, the detection accuracy of the algorithm still exceeds 85%, and the average detection time is reduced to 1/3 of that of traditional endpoint detection algorithms.
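The final stage named in the abstract, the uniform sub-band frequency-band variance method, can be illustrated with a minimal sketch. The abstract gives no parameters, so the frame length, hop size, number of sub-bands, and the noise-based threshold rule below are all assumptions, and the function names are hypothetical. The EEMD and OS-DL denoising stages are not reproduced here.

```python
import numpy as np

def subband_variance(signal, frame_len=256, hop=128, n_subbands=8):
    """Per-frame variance of uniform sub-band magnitude sums.

    frame_len, hop and n_subbands are illustrative values, not taken
    from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    variances = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # Magnitude spectrum of the frame, with the DC bin dropped.
        mag = np.abs(np.fft.rfft(frame))[1:]
        # Split the spectrum into equal-width sub-bands and sum each one.
        width = len(mag) // n_subbands
        bands = mag[: width * n_subbands].reshape(n_subbands, width).sum(axis=1)
        # Speech concentrates energy in a few sub-bands, so the variance
        # across sub-bands rises; broadband noise keeps it low.
        variances[i] = bands.var(ddof=1)
    return variances

def detect_speech_frames(signal, noise_frames=10, factor=3.0, **kwargs):
    """Flag frames whose sub-band variance exceeds a noise-based threshold.

    Assumes the first `noise_frames` frames contain no speech; the
    threshold rule (factor * mean noise variance) is a hypothetical choice.
    """
    v = subband_variance(signal, **kwargs)
    threshold = factor * v[:noise_frames].mean()
    return v > threshold
```

In a pipeline like the one the abstract describes, `signal` would be the denoised speech rather than the raw recording, and the first and last flagged frames of each run would mark the detected endpoints.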
Authors
张开生
赵小芬
王泽
宋帆
ZHANG Kai-sheng; ZHAO Xiao-fen; WANG Ze; SONG Fan (College of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China)
Source
《科学技术与工程》
Peking University Core Journal
2020, Issue 35, pp. 14536-14542 (7 pages)
Science Technology and Engineering
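Funding
Shaanxi Provincial Science and Technology Program (2017GY-063, 2017ZDXM-SF-035)
Special Scientific Research Program of the Shaanxi Provincial Department of Education (16JK1100).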