Abstract
At present, sound source separation mostly relies on supervised deep learning, which typically requires large amounts of labeled data for training. In practice, however, labeled data is difficult and expensive to obtain. Separating sources from unlabeled data must therefore rely on meaningful prior knowledge to compensate for the lack of labels. To this end, this paper proposes a U-Net model based on prior knowledge. The approach affects neither the network complexity of existing convolutional architectures nor their convergence behavior, yet it significantly improves the quality of the separated audio. Experimental results show that the proposed method separates sources better than traditional models.
Author
郭慧娴
GUO Huixian (College of Information Science and Technology, Beijing Normal University, Beijing 100875, China)
Source
《电声技术》
2022, No. 10, pp. 84-86 (3 pages)
Audio Engineering