Attention-based neural network for end-to-end music separation

Abstract: End-to-end separation algorithms, which perform very well in speech separation, have not yet been used effectively in music separation. Moreover, because music signals are often dual-channel data with a high sampling rate, modelling long-sequence data and making rational use of the relevant information between channels remain urgent open problems. To address these problems, the performance of the end-to-end music separation algorithm is enhanced by improving the network structure. Our main contributions are as follows: (1) A more reasonable densely connected U-Net is designed to capture the long-term characteristics of music, such as main melody and tone. (2) On this basis, multi-head attention and a dual-path transformer are introduced in the separation module. Channel attention units are applied recursively on the feature map of each layer of the network, enabling the network to perform long-sequence separation. Experimental results show that, after the introduction of channel attention, the proposed algorithm improves consistently over the baseline system. On the MUSDB18 dataset, the average score of the separated audio exceeds that of the current best-performing music separation algorithm based on the time-frequency domain (T-F domain).
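The abstract does not describe the internals of the channel attention unit, so the following is only a minimal sketch of one common form (a squeeze-and-excitation-style gate), written in plain Python for illustration. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy input are all hypothetical and are not taken from the paper.

```python
import math

def channel_attention(features, w1, w2):
    """Rescale each channel of `features` by a gate in (0, 1).

    features: list of C channels, each a list of T samples.
    w1: R x C weight matrix (bottleneck down to R hidden units).
    w2: C x R weight matrix (expand back to one gate per channel).
    All names and shapes are assumptions made for this sketch.
    """
    # Squeeze: summarise each channel by its mean over time.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every sample of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates

# Toy input: 2 feature channels, 4 time steps (hypothetical values).
features = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
w1 = [[1.0, 1.0]]       # 1 hidden unit
w2 = [[2.0], [-2.0]]    # one gate per channel
scaled, gates = channel_attention(features, w1, w2)
```

In a trained network the weights would be learned, and the gate would be recomputed on the feature map of each layer where the unit is applied; the sketch only shows the squeeze, excitation, and scaling steps for a single feature map.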

Authors: Jing Wang, Hanyue Liu, Haorong Ying, Chuhan Qiu, Jingxin Li, Muhammad Shahid Anwar

Affiliations: Beijing Institute of Technology, Communication University of China, China Electronics Standardization Institute, Gachon University

Source: CAAI Transactions on Intelligence Technology (智能技术学报（英文）), SCIE, EI, 2023, Issue 2, pp. 355-363 (9 pages)

Funding: National Natural Science Foundation of China, Grant/Award Number: 62071039; Beijing Natural Science Foundation, Grant/Award Number: L223033.

Keywords: channel attention, densely connected network, end-to-end music separation

Classification number: TN9 [Electronics and Telecommunications: Information and Communication Engineering]

