Abstract
[Objective] Deep learning for bird species recognition is an active research topic. To further improve recognition performance, a bird species recognition method based on Chirplet spectrogram features of bird vocalizations and a deep convolutional neural network is proposed. [Method] The Chirplet transform (CT) is introduced to compute spectrograms of bird vocalization signals, which are fed into the deep convolutional neural network VGG16; species are recognized by classifying the spectrograms. Taking 18 bird species recorded in the field in Beijing Songshan National Nature Reserve as the study objects, three spectrogram sample sets were computed using the Chirplet transform, the short-time Fourier transform (STFT), and the Mel frequency cepstrum transform (MFCT), and the performance of the recognition model was compared with each sample set as input. [Result] With Chirplet spectrograms as input, the mean average precision (MAP) on the test set reached 0.9871, higher than with the other two inputs, and the maximum MAP during training was reached in the fewest epochs. [Conclusion] The spectrogram feature chosen as input directly affects the classification performance of the deep learning model. The vocalization regions in the Chirplet spectrograms computed here are more concentrated than those in the STFT and Mel spectrograms, and their features are more distinct. Chirplet spectrograms are therefore better suited to bird species recognition based on the VGG16 model, yielding a higher MAP and higher recognition efficiency.
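The abstract does not give the Chirplet formulation or its parameters, so the following is only a minimal sketch of how a Chirplet-style spectrogram might be computed, assuming Gaussian-windowed linear-chirp atoms. Each frame of the signal is projected onto the atoms, and the largest magnitude over a small set of chirp rates is kept for every frequency bin. The function names, frame length, hop, frequency grid, and chirp rates are all hypothetical choices for illustration, not the authors' settings. The sketch uses only the standard library for self-containment; a vectorized NumPy version would be much faster. With a chirp rate of 0 the atoms reduce to Gabor atoms, so that case behaves like an STFT magnitude.

```python
import cmath
import math


def gaussian_chirplet(length, freq, chirp_rate, fs):
    """Unit-energy Gaussian-windowed linear chirp, centered in the frame.

    Instantaneous frequency is freq + chirp_rate * t (Hz), with t = 0 at
    the frame center. The window width is an assumption (frame ~ 6 sigma).
    """
    sigma = length / (6.0 * fs)  # Gaussian std dev in seconds (assumed)
    atom = []
    for n in range(length):
        t = (n - (length - 1) / 2.0) / fs
        envelope = math.exp(-0.5 * (t / sigma) ** 2)
        phase = 2.0 * math.pi * (freq * t + 0.5 * chirp_rate * t * t)
        atom.append(envelope * cmath.exp(1j * phase))
    norm = math.sqrt(sum(abs(a) ** 2 for a in atom))
    return [a / norm for a in atom]


def chirplet_spectrogram(signal, fs, frame_len=128, hop=64, n_freqs=32,
                         chirp_rates=(-2000.0, 0.0, 2000.0)):
    """Return (freqs, image), where image[i][k] is the largest projection
    magnitude of frame k onto the atoms at frequency freqs[i], taken over
    all chirp rates. All default parameters are illustrative assumptions.
    """
    freqs = [i * (fs / 2.0) / n_freqs for i in range(n_freqs)]
    atoms = [[gaussian_chirplet(frame_len, f, c, fs) for c in chirp_rates]
             for f in freqs]
    starts = range(0, len(signal) - frame_len + 1, hop)
    image = [[0.0] * len(starts) for _ in freqs]
    for k, s in enumerate(starts):
        frame = signal[s:s + frame_len]
        for i in range(n_freqs):
            best = 0.0
            for atom in atoms[i]:
                # Inner product of the frame with the complex atom.
                coeff = sum(x * a.conjugate() for x, a in zip(frame, atom))
                best = max(best, abs(coeff))
            image[i][k] = best
    return freqs, image
```

The resulting `image` is a frequency-by-time array that could be rescaled and saved as a picture before being passed to an image classifier such as VGG16; that classification step is not shown here.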
Authors
Xie Jiangjian (谢将剑)
Li Wenbin (李文彬)
Zhang Junguo (张军国)
Ding Changqing (丁长青)
Xie Jiangjian1, Li Wenbin1, Zhang Junguo1, Ding Changqing2 (1. School of Technology, Beijing Forestry University, Beijing 100083, China; 2. School of Nature Conservation, Beijing Forestry University, Beijing 100083, China)
Source
Journal of Beijing Forestry University (《北京林业大学学报》)
CAS
CSCD
PKU Core Journal
2018, No. 3, pp. 122-127 (6 pages)
Funding
Fundamental Research Funds for the Central Universities (2017JC14)
National Key Research and Development Program of China (2017YFC1403503)
Keywords
bird
Chirplet transform
spectrogram feature
deep convolutional neural network
species recognition