Abstract
This work extends the standard U-Net to multi-source separation by adapting the frequency transform block to multi-source tasks. First, a conditioning network based on complex-valued spectrograms is proposed to capture source-dependent time-frequency patterns. Second, a latent source attention mechanism is used to extract global time-frequency information and to establish long-range, hierarchical time-frequency dependencies. In addition, a reparameterized structure enriches the feature space of the convolutional blocks, so that performance is maintained without a large increase in the number of parameters. Finally, experiments on the MUSDB source separation task show that the proposed method performs comparably to several existing methods.
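The abstract does not say which reparameterized structure is used, so the following is only a minimal sketch of the general idea, assuming a RepVGG-style block in which a 3x3 branch and a 1x1 branch run in parallel during training and are folded into one 3x3 kernel for inference. The helper names (`conv2d_same`, `merge_branches`) and the single-channel, bias-free setting are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): fold a parallel 1x1 branch
# into a 3x3 kernel so the merged layer computes the same output.
# Assumptions: single channel, no bias, zero padding, stride 1.

def conv2d_same(image, kernel):
    """Cross-correlate a 2D list with an odd-sized square kernel (zero-padded)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def merge_branches(k3, k1):
    """Return a new 3x3 kernel with the 1x1 weight added to its centre tap."""
    merged = [row[:] for row in k3]  # copy so the training kernel is untouched
    merged[1][1] += k1[0][0]
    return merged


image = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 1.0, 1.0]]
k3 = [[0.5, 0.0, -0.5], [1.0, 0.25, -1.0], [0.5, 0.0, -0.5]]
k1 = [[2.0]]

# Training-time form: two branches evaluated separately and summed.
a = conv2d_same(image, k3)
b = conv2d_same(image, k1)
two_branch = [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

# Inference-time form: a single convolution with the merged kernel.
one_branch = conv2d_same(image, merge_branches(k3, k1))

max_diff = max(
    abs(two_branch[i][j] - one_branch[i][j]) for i in range(3) for j in range(3)
)
print("branches agree:", max_diff < 1e-9)
```

Because convolution is linear, the extra branch enriches the training-time feature space but adds no parameters to the merged inference-time layer, which matches the trade-off the abstract describes.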
Authors
YANG Daowu (杨道武)
CHEN Wenjie (陈文洁)
CHEN Aibin (陈爱斌)
Affiliations: College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China; Institute of Artificial Intelligence Application, Central South University of Forestry and Technology, Changsha 410004, China
Source
Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》)
Peking University Core Journal (北大核心)
2022, No. 2, pp. 61-66 (6 pages)
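Funding
National Natural Science Foundation of China, Young Scientists Fund (61703441)
Hunan Key Laboratory of Intelligent Logistics Technology (2019TP1015)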
Keywords
audio source separation
reparameterization
time-frequency pattern
conditional mechanism
complex-valued spectrogram