Abstract
Feature-parameter algorithms for voice activity detection (VAD) are computationally complex and difficult to deploy on hardware platforms. To address these problems, the traditional energy statistics complexity algorithm is optimized. After the preprocessed speech data undergo a fast Fourier transform (FFT), only the positive-frequency part is kept. The spectral line energy and its logarithm are computed from the frequency components in the first half of each frame to obtain the complexity, so the probability density calculation is omitted. The improved algorithm processes the speech signal frame by frame in a pipeline, requires little computation, handles data efficiently, and is well suited to hardware. The endpoint detection system makes its decision with single-parameter double-threshold endpoint detection and is implemented on an EP4CE22E22C8 FPGA. Experimental results show that the system detects speech well in both high and low signal-to-noise ratio environments. The lag in detecting the start of speech is 96 ms, which gives good real-time performance.
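The processing chain summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact complexity formula, so `frame_feature` uses an assumed stand-in (the sum of each spectral line energy times the logarithm of that energy plus one), and the function names, the thresholds `low` and `high`, and the tiny 8-sample frame are all hypothetical. Only the overall shape (FFT, positive-frequency half, energy and logarithm, single-parameter double-threshold decision) follows the text.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frame_feature(frame):
    """Assumed per-frame feature (NOT the paper's formula):
    sum of E_k * log(1 + E_k) over the positive-frequency bins."""
    spectrum = fft(frame)
    half = spectrum[: len(frame) // 2]      # keep the positive-frequency half
    energies = [abs(c) ** 2 for c in half]  # spectral line energy
    return sum(e * math.log1p(e) for e in energies)


def detect_endpoints(features, low, high):
    """Single-parameter double-threshold decision (hypothetical variant):
    a segment is a run of frames above `low` that contains at least
    one frame above `high`. Returns (start_frame, end_frame) pairs."""
    segments = []
    start = None
    confirmed = False
    for i, f in enumerate(features):
        if start is None:
            if f > low:
                start = i
                confirmed = f > high
        else:
            if f > high:
                confirmed = True
            if f <= low:
                if confirmed:
                    segments.append((start, i - 1))
                start = None
                confirmed = False
    if start is not None and confirmed:
        segments.append((start, len(features) - 1))
    return segments


# Usage: a silent frame and a frame containing one sine cycle.
silence = [0.0] * 8
tone = [math.sin(2 * math.pi * k / 8) for k in range(8)]
features = [frame_feature(f) for f in (silence, tone, tone, silence)]
segments = detect_endpoints(features, low=1.0, high=10.0)
```

Because each frame's feature depends only on that frame, the computation maps naturally onto a frame-by-frame pipeline. A hardware version would replace the floating-point logarithm with whatever fixed-point approximation the target device supports.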
Authors
GUO Laigong (郭来功)
CHEN Song (陈松)
College of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China
Source
Video Engineering (《电视技术》)
2019, No. 2, pp. 56-60, 110 (6 pages in total)
Keywords
voice activity detection
energy statistics complexity algorithm
pipelining calculation
double threshold endpoint detection
FPGA