
Research on the application of energy spectrum with voiceprint information in bird recognition (cited by: 4)
Abstract: Bird-call recognition based on Mel-frequency cepstral coefficients combined with a Gaussian mixture model (MFCC+GMM) adapts poorly to noisy environments, converges with difficulty, and has high computational complexity. This paper proposes a bird recognition method based on energy spectrograms fused with voiceprint information (VPS-BR). The method exploits the multi-dimensional differences that bird calls exhibit on the energy spectrogram to identify voiceprint features quantitatively. Decibel energy is color-mapped to obtain the energy spectrogram, the acoustic features expressed by its visual features are extracted, and species-specific call patterns are derived by analysis. In the feature extraction step, two descriptors characterize the edge voiceprint of the bird-call spectrogram: the local binary pattern (LBP), chosen for its recognition speed, and the histogram of oriented gradients (HOG), chosen for its robustness. In the recognition step, each of the two features is paired with each of three classifiers, support vector machine (SVM), K-nearest neighbor (KNN), and random forest, to build and test recognition models. In cross-validation on a dataset of original noisy calls from 15 bird species taken from the Xeno-Canto website, the average recognition rate of the VPS-BR models is 11.3% higher than that of the MFCC+GMM model, and the HOG+KNN combination reaches 90.5%, showing good noise robustness and recognition performance. Finally, to address the shortage of sample data, a generative adversarial network (GAN) is used for image augmentation, which raises the recognition rate by a further 1.48%.
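The pipeline described in the abstract (decibel spectrogram, image mapping, texture descriptor, classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, uses a grayscale mapping as a stand-in for the paper's color mapping, implements only the basic 8-neighbor LBP variant (HOG is omitted), and pairs it with a plain nearest-neighbor vote. The frame length, hop size, synthetic signals, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def db_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Short-time power spectrum in decibels, shape (freq_bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10.0 * np.log10(power + eps).T

def to_gray(spec_db):
    """Map dB values linearly to 0..255 (assumed stand-in for color mapping)."""
    lo, hi = spec_db.min(), spec_db.max()
    scaled = 255.0 * (spec_db - lo) / (hi - lo + 1e-12)
    return np.round(scaled).astype(np.uint8)

def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 8-neighbor LBP codes."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes += (neighbour >= center) * (1 << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def knn_predict(train_x, train_y, query, k=1):
    """Majority label among the k training vectors closest to the query."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy usage on synthetic signals (not real bird recordings).
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 1000 * t)
chirp = np.sin(2 * np.pi * (500 * t + 1500 * t ** 2))

def features(signal):
    noisy = signal + 0.1 * rng.standard_normal(signal.size)
    return lbp_histogram(to_gray(db_spectrogram(noisy)))

train_x = np.stack([features(tone), features(chirp)])
train_y = np.array(["tone", "chirp"])
print(knn_predict(train_x, train_y, features(chirp)))
```

A histogram of LBP codes discards where in the spectrogram a texture occurs, which keeps the feature vector short and fast to compare. A HOG descriptor, which the abstract reports as the stronger feature when combined with KNN, would retain coarse spatial layout at the cost of a longer vector.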
Authors: YANG Chunyong; QI Hongda; PENG Yanqiu; YIN Bin; HOU Jin; SHU Zhenyu; CHEN Shaoping (Hubei Key Laboratory of Intelligent Wireless Communications, Wuhan 430074, China; College of Electronics and Information Engineering, South-Central University for Nationalities, Wuhan 430074, China)
Source: Journal of Applied Acoustics (《应用声学》), indexed in CSCD and the Peking University Core Journals list; 2020, No. 3, pp. 453-463 (11 pages)
Keywords: bird recognition; power spectrogram; local binary pattern; histogram of oriented gradients; generative adversarial network
