Speech Super-Resolution Algorithm Based on Discrete Wavelet Packet Transform and Capsule Generative Adversarial Network
Abstract: Mainstream speech super-resolution (SSR) algorithms use convolutional neural networks (CNN) to convert a low-resolution (LR) speech signal into a high-resolution (HR) one. However, the HR signal reconstructed by an ordinary CNN alone is usually over-smoothed and lacks detail. Introducing generative adversarial networks (GAN) addresses this problem well and yields high-quality speech signals. In addition, capsule networks (CapsNet) can encode spatial information into features, so combining them with a GAN improves the discriminator's ability to distinguish real data from generated data. The discrete wavelet transform (DWT) is a tool for orthogonal multi-resolution analysis that performs very well in signal processing. An extension of the DWT is the discrete wavelet packet transform (DWPT), which provides more effective signal analysis in some applications. This paper proposes Wavelet-SRGAN, an SSR network architecture based on the DWPT and a capsule generative adversarial network (CapsGAN). Comparative experiments show that the proposed algorithm achieves performance comparable to current state-of-the-art methods while using the fewest parameters. The algorithm has three core steps: (1) adding a DWPT layer to the generator network; (2) adding a capsule network to the discriminator; (3) adding a wavelet loss during training.
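To make the DWPT and the wavelet-loss idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which wavelet, decomposition depth, or loss definition the paper uses, so the Haar filter, the `depth` parameter, and the mean-absolute-difference form of `wavelet_loss` are all assumptions chosen for illustration. The sketch only shows how a wavelet packet tree differs from the plain DWT (both branches are split at every level) and how a loss could be computed on the resulting sub-bands.

```python
import math


def haar_step(x):
    """One orthogonal Haar analysis step: split an even-length signal
    into a low-pass half and a high-pass half."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high


def dwpt(x, depth):
    """Full wavelet packet decomposition (Haar assumed for illustration).

    The plain DWT re-splits only the low-pass branch; the DWPT re-splits
    both branches, giving 2**depth sub-bands of equal length.
    Sub-bands are returned in natural (filter-bank) order.
    """
    if len(x) % (2 ** depth) != 0:
        raise ValueError("signal length must be divisible by 2**depth")
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = haar_step(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def wavelet_loss(pred, target, depth=2):
    """Hypothetical wavelet loss: mean absolute difference between the
    DWPT coefficients of the predicted and the reference signal."""
    if len(pred) != len(target):
        raise ValueError("signals must have the same length")
    pred_bands = dwpt(pred, depth)
    target_bands = dwpt(target, depth)
    total = sum(
        abs(p - t)
        for pb, tb in zip(pred_bands, target_bands)
        for p, t in zip(pb, tb)
    )
    # The transform keeps the number of coefficients equal to len(pred).
    return total / len(pred)
```

Because the Haar step is orthogonal, the sub-bands preserve the signal energy, which matches the abstract's description of wavelet analysis as an orthogonal multi-resolution tool. In a trained network the same decomposition would typically be written with differentiable tensor operations so that the loss can be backpropagated; that part is omitted here.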
Authors: CHEN Xi-kun (陈习坤); YANG Jun-mei (杨俊美) (School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510640, China)
Source: Acta Electronica Sinica (《电子学报》), 2023, No. 4, pp. 1039-1049 (11 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (No. 61871188, No. 61801133).
Keywords: speech super-resolution; generative adversarial networks; discrete wavelet transform; discrete wavelet packet transform; wavelet loss