Abstract
Frequency-domain joint blind source separation (FD-JBSS) is an effective way to resolve the internal permutation problem, i.e. the alignment of sources across frequency bins. However, current FD-JBSS algorithms cannot reliably determine the order of the output channels, which leaves the global permutation problem unresolved. This paper addresses the problem within the separation process itself: a variational autoencoder (VAE)-based source model is used, and the order of the output channels is controlled by pre-assigning the order of the speaker embeddings extracted from enrollment utterances. Two normalization layers, instance normalization (IN) and adaptive instance normalization (AdaIN), are used in the VAE architecture so that an arbitrary channel permutation can be enforced. To mitigate the block permutation problem that may arise in FD-JBSS and to further improve separation performance, a denoising training stage applied solely to the decoder network is proposed, using two kinds of artificially constructed noisy signals. Simulations with measured room impulse responses (RIRs), covering both seen and unseen speakers, show that the proposed method arranges the output channels in the intended order while maintaining separation performance.
Authors
顾昭仪
卢晶
Zhaoyi Gu; Jing Lu (Key Laboratory of Modern Acoustics, Ministry of Education, Institute of Acoustics of Nanjing University, Nanjing, 210093, China)
Source
《南京大学学报(自然科学版)》
CAS
CSCD
PKU Core Journals (北大核心)
2021, Issue 4, pp. 671-682 (12 pages)
Journal of Nanjing University (Natural Science)
Funding
National Natural Science Foundation of China (11874219).
Keywords
joint blind source separation
global permutation
variational autoencoder
instance normalization