Abstract
To address the low efficiency of traditional pathological speech recognition, this paper proposes a method that uses speech features learned by a convolutional neural network (CNN), so that features are extracted automatically. Mel spectrogram features are extracted from the raw speech signal, and data augmentation is applied to the original spectrogram images. Following a transfer learning approach, an AlexNet network is fine-tuned and trained; the images are then fed into the trained CNN to extract utterance-level features, and the outputs are reduced to a uniform dimension by temporal pyramid matching, yielding speech features of equal length. The extracted features are classified with a neural network and with a support vector machine (SVM) classifier, respectively, to perform pathological speech recognition. Experimental results show that the neural network extracts complex, abstract features well and avoids the laborious up-front data processing and analysis, while improving accuracy over traditional feature extraction methods.
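The step that turns variable-length utterances into equal-length feature vectors can be illustrated with a minimal sketch of temporal pyramid pooling. The abstract does not give the pyramid configuration, so the levels `(1, 2, 4)`, the max-pooling reduction, the input layout (one CNN feature vector per spectrogram segment), and the function name `temporal_pyramid_pool` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def temporal_pyramid_pool(segment_features, levels=(1, 2, 4), reduce="max"):
    """Pool a (num_segments, feat_dim) array into one fixed-length vector.

    Hypothetical helper: `levels` and `reduce` are illustrative choices,
    not values taken from the paper.
    """
    feats = np.asarray(segment_features, dtype=float)
    if feats.ndim != 2 or feats.shape[0] == 0:
        raise ValueError("expected a non-empty (num_segments, feat_dim) array")

    pool = np.max if reduce == "max" else np.mean
    n = feats.shape[0]
    pooled = []
    for parts in levels:
        # Split the time axis into `parts` roughly equal windows.
        edges = np.linspace(0, n, parts + 1).astype(int)
        for start, end in zip(edges[:-1], edges[1:]):
            # Keep every window non-empty, even for very short utterances.
            end = max(end, start + 1)
            pooled.append(pool(feats[start:end], axis=0))
    # Output length is sum(levels) * feat_dim, independent of num_segments.
    return np.concatenate(pooled)


# Two utterances of different length map to vectors of the same length.
rng = np.random.default_rng(0)
short_utt = rng.normal(size=(5, 8))    # 5 segments, 8-dim features (toy sizes)
long_utt = rng.normal(size=(12, 8))    # 12 segments
vec_short = temporal_pyramid_pool(short_utt)
vec_long = temporal_pyramid_pool(long_utt)
```

Because the number of windows depends only on the pyramid levels, both vectors here have `(1 + 2 + 4) * 8` entries and can be passed directly to a fixed-input classifier such as an SVM.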
Authors
JIANG Yufei (姜羽菲)
SHI Yu (石宇)
HE Ruonan (何若男)
CHEN Yi (陈益)
CAO Hui (曹辉)
School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710119, China
Source
Electronic Design Engineering (《电子设计工程》)
2024, No. 20, pp. 26-30 (5 pages)
Funding
National Natural Science Foundation of China (12374440).
Keywords
pathological speech recognition
Mel spectrogram
convolutional neural network
temporal pyramid matching