Abstract
Speech endpoint detection plays an important role in speech recognition systems. Because it is difficult to capture a complete speech signal in real time when the noise environment varies, this paper proposes a real-time speech endpoint detection method. The method first extracts the short-time average zero-crossing rate and the Mel-frequency cepstral coefficients (MFCC) of each frame. It then normalizes the current frame using the MFCCs of the first N frames of background noise and takes the L2 norm of the resulting feature vector as an additional feature. Finally, the valid speech signal is segmented on the basis of a multi-feature analysis. Experimental results show that the method captures complete speech signals with relatively high accuracy in varying noise environments.
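The feature pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the noise frames normalize the current frame, so the sketch assumes per-coefficient mean subtraction and standard-deviation scaling. The function names, the `eps` guard, the thresholds, and the OR-combination of the two features are all hypothetical. MFCC extraction is assumed to happen elsewhere; the code only consumes per-frame feature vectors.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in one frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def noise_normalized_norms(features, n_noise, eps=1e-8):
    """L2 norm of every frame's feature vector after noise normalization.

    features: list of per-frame vectors (for example MFCCs computed elsewhere).
    n_noise:  number of leading frames assumed to hold background noise only.
    Assumption: normalization is a per-coefficient z-score against the
    statistics of the noise frames; the abstract does not specify this.
    """
    noise = features[:n_noise]
    dim = len(features[0])
    mean = [sum(v[k] for v in noise) / n_noise for k in range(dim)]
    std = [
        math.sqrt(sum((v[k] - mean[k]) ** 2 for v in noise) / n_noise)
        for k in range(dim)
    ]
    norms = []
    for v in features:
        # eps keeps the division finite when a coefficient is constant in noise
        z = [(v[k] - mean[k]) / (std[k] + eps) for k in range(dim)]
        norms.append(math.sqrt(sum(x * x for x in z)))
    return norms


def detect_speech(norms, zcrs, norm_thresh, zcr_thresh):
    """Hypothetical decision rule: flag a frame if either feature is high."""
    return [n > norm_thresh or z > zcr_thresh for n, z in zip(norms, zcrs)]
```

Frames that resemble the background noise yield a small norm, while frames that deviate from it yield a large one. The zero-crossing rate is kept as a second cue because unvoiced consonants tend to have low energy but many sign changes. A real-time variant would presumably update the noise statistics as new non-speech frames arrive, which this sketch does not attempt.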
Source
Wireless Internet Technology (《无线互联科技》)
2017, Issue 22, pp. 50-53 (4 pages)
Funding
National Natural Science Foundation of China
Grant No. 61422201
Keywords
speech endpoint detection
Mel-frequency cepstral coefficients (MFCC)
short-time average zero-crossing rate
multi-feature