Abstract
Bird-call recognition based on Mel-frequency cepstral coefficients combined with a Gaussian mixture model (MFCC+GMM) adapts poorly to noisy environments, the model is hard to converge, and the computational complexity is high. This paper proposes a bird recognition method based on power spectrograms that incorporate voiceprint information (VPS-BR). The method exploits the multi-dimensional differences that bird calls exhibit on the power spectrogram to identify the voiceprint features of the calls quantitatively. The power spectrogram is obtained by color-mapping the energy in decibels; the acoustic features expressed by its visual features are then extracted and analyzed to derive call patterns specific to each bird species. In the feature extraction step, two descriptors characterize the edge voiceprint of the spectrogram: the local binary pattern (LBP), chosen for its fast recognition, and the histogram of oriented gradients (HOG), chosen for its high robustness. In the recognition step, each of the two features is paired with each of three classifiers, support vector machine (SVM), K-nearest neighbor (KNN), and random forest, to build and test recognition models. Cross-validation on a data set of original noisy calls from 15 bird species, taken from the Xeno-Canto website, shows that the average recognition rate of the VPS-BR models is 11.3% higher than that of the MFCC+GMM model, and that the HOG+KNN model reaches a recognition rate of 90.5%, indicating good noise robustness and recognition performance. Finally, to address the shortage of samples, a generative adversarial network (GAN) is used for image augmentation, which raises the recognition rate by a further 1.48%.
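The feature-plus-classifier idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the basic 8-neighbour LBP variant, a 1-nearest-neighbour vote with Euclidean distance, and two synthetic striped "spectrograms" in place of real recordings. All function names and the labels `tonal` and `click` are hypothetical. HOG, SVM, random forest, and the GAN step are not shown.

```python
import math
from collections import Counter

def power_to_db(power, floor=1e-10):
    """Convert a 2D grid of linear power values to decibels."""
    return [[10.0 * math.log10(max(p, floor)) for p in row] for row in power]

def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes (assumed variant)."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the center sets its bit.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist]

def knn_predict(train_feats, train_labels, query, k=1):
    """Majority vote among the k training features closest to the query."""
    ranked = sorted((math.dist(f, query), lab)
                    for f, lab in zip(train_feats, train_labels))
    votes = Counter(lab for _, lab in ranked[:k])
    return votes.most_common(1)[0][0]

def stripes(rows, cols, horizontal, period=2):
    """Toy power grid: alternating loud/quiet bands along one axis."""
    return [[1.0 if ((r if horizontal else c) // period) % 2 == 0 else 1e-3
             for c in range(cols)] for r in range(rows)]

# Toy training set: one horizontal-band pattern and one vertical-band pattern.
train_imgs = [stripes(8, 8, True), stripes(8, 8, False)]
train_labels = ["tonal", "click"]
train_feats = [lbp_histogram(power_to_db(g)) for g in train_imgs]

# A larger horizontal-band grid should land nearer the "tonal" example.
query_feat = lbp_histogram(power_to_db(stripes(10, 12, True)))
prediction = knn_predict(train_feats, train_labels, query_feat, k=1)
print(prediction)
```

Because the histogram is normalized, grids of different sizes remain comparable, which is why the 10x12 query can be matched against 8x8 training examples.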
Authors
YANG Chunyong (杨春勇)
QI Hongda (祁宏达)
PENG Yanqiu (彭焱秋)
YIN Bin (尹滨)
HOU Jin (侯金)
SHU Zhenyu (舒振宇)
CHEN Shaoping (陈少平)
Affiliations: Hubei Key Laboratory of Intelligent Wireless Communications, Wuhan 430074, China; College of Electronics and Information Engineering, South-Central University for Nationalities, Wuhan 430074, China
Source
Journal of Applied Acoustics (《应用声学》)
CSCD
PKU Core (北大核心)
2020, No. 3, pp. 453-463 (11 pages)
Keywords
Bird recognition
Power spectrogram
Local binary pattern
Histogram of oriented gradients
Generative adversarial network