Abstract
Speech articulation is important for speech recognition in communication terminals and equipment systems. This paper processes speech signals collected under 110 dB noise interference as follows: spectral subtraction is applied for denoising; double-threshold endpoint detection extracts the spoken segments; Mel Frequency Cepstral Coefficients (MFCC) are then extracted and differenced to obtain first-order and second-order components, which are combined with short-time energy to form the feature parameters of the speech signal; finally, Dynamic Time Warping (DTW) is used for similarity-based recognition. Experiments show that the algorithm achieves a recognition rate of 92.90% on the Chinese articulation Diagnostic Rhyme Test (DRT) word list.
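The final matching step named in the abstract, DTW, can be illustrated with a minimal sketch. This is the textbook DTW recurrence with a Euclidean frame distance, not the authors' implementation; the function names `dtw_distance` and `recognize`, the template dictionary layout, and the toy one-dimensional feature vectors are assumptions made for illustration only. In practice each frame would be a vector of MFCC, delta, delta-delta, and short-time energy values.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean distance between frames and the basic
    three-neighbour step pattern (no slope constraints or windowing).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a hypothetical dict mapping a word label to its
    reference feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


if __name__ == "__main__":
    # Toy 1-D "features"; real frames would be MFCC-based vectors.
    templates = {"a": [[0.0], [1.0]], "b": [[5.0], [6.0]]}
    print(recognize([[0.1], [0.9], [1.0]], templates))
```

Because DTW lets one frame align with several frames of the other sequence, two utterances of the same word spoken at different speeds can still yield a low cost, which is why it suits template matching on short word lists such as a DRT table.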
Authors
马成龙
焦俊清
焦富清
王杰
陈巧特
谢武俊
李军
Ma Chenglong; Jiao Junqing; Jiao Fuqing; Wang Jie; Chen Qiaote; Xie Wujun; Li Jun (Wuhan Patron Data Technology Co., Ltd., Wuhan 430205, China; Hongyu Life Saving Equipment Co., Ltd., Xiangyang 441058, China)
Source
《信息化研究》
2024, Issue 2, pp. 63-68 (6 pages)
INFORMATIZATION RESEARCH
Keywords
speech articulation
spectral subtraction
endpoint detection
Mel Frequency Cepstral Coefficients
Dynamic Time Warping
Diagnostic Rhyme Test