Abstract
To address the overlapping of vocalizations that occurs when several birds call at the same time, this paper proposes a bird sound separation method that incorporates spatial features between recording channels. The method takes both the spectral and the spatial features of the overlapped signal as input to the separation model, a deep learning model named U-Conformer, which predicts a spectral magnitude mask (SMM) for the direction of each sound source. Each source signal is then recovered from the overlapped signal using the estimated SMM. Experiments on generated multi-channel, multi-source overlapped bird sound data show that the proposed method separates sources better than other deep learning model architectures, which should aid the analysis of field recordings of bird sound.
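The masking step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the clipping of the mask to [0, 1], the reuse of the mixture phase, and the choice of inter-channel phase difference (IPD) as the spatial feature are all assumptions made for illustration. In the paper the mask is predicted by U-Conformer; here an ideal mask computed from a known source stands in for that prediction.

```python
import numpy as np

def ideal_smm(source_stft, mixture_stft, eps=1e-8, max_value=1.0):
    """Ideal spectral magnitude mask |S| / |X|, clipped to [0, max_value].

    The clipping range is an assumption; a trained model would predict
    this mask instead of computing it from the clean source.
    """
    ratio = np.abs(source_stft) / (np.abs(mixture_stft) + eps)
    return np.clip(ratio, 0.0, max_value)

def apply_smm(mask, mixture_stft):
    """Scale the mixture magnitude by the mask and keep the mixture phase."""
    return mask * mixture_stft

def interchannel_phase_difference(stft_ch1, stft_ch2):
    """One possible spatial feature (assumed here): phase difference per bin."""
    return np.angle(stft_ch1 * np.conj(stft_ch2))

# Toy complex spectrograms (frequency bins x frames) standing in for real STFTs.
rng = np.random.default_rng(0)
shape = (129, 50)
source_a = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
source_b = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = source_a + source_b

mask_a = ideal_smm(source_a, mixture)
estimate_a = apply_smm(mask_a, mixture)
ipd = interchannel_phase_difference(mixture, mixture)  # identical channels
```

An inverse STFT of `estimate_a` would give the time-domain estimate of the first source; with real recordings, the IPD would be computed between two different microphone channels.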
Authors
倪东明
石煜炜
夏灿玮
谢将剑
NI Dongming; SHI Yuwei; XIA Canwei; XIE Jiangjian (School of Technology, Beijing Forestry University, 100083, Beijing, China; Research Center for Biodiversity Intelligent Monitoring, Beijing Forestry University, 100083, Beijing, China; Ministry of Education Key Laboratory for Biodiversity and Ecological Engineering, College of Life Sciences, Beijing Normal University, 100875, Beijing, China)
Source
《北京师范大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals
2023, No. 3, pp. 388-395 (8 pages)
Journal of Beijing Normal University(Natural Science)
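Funding
China University Industry-University-Research Innovation Fund (2021LDA05002)
Fundamental Research Funds for the Central Universities (2021ZY70)
Beijing Natural Science Foundation (6214040)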