Speech Super-Resolution Algorithm Based on Discrete Wavelet Packet Transform and Capsule Generative Adversarial Network

Abstract: Current mainstream speech super-resolution (SSR) algorithms use convolutional neural networks (CNN) to transform a low-resolution (LR) speech signal into a high-resolution (HR) speech signal. However, the HR signal reconstructed by an ordinary CNN alone is usually over-smoothed and lacks detail. Introducing generative adversarial networks (GAN) addresses this problem well. In addition, capsule networks (CapsNet) can encode spatial information into features, so combining them with a GAN improves the discriminator's ability to judge whether data are real or fake. The discrete wavelet transform (DWT) is a tool for orthogonal multi-resolution analysis that performs very well in signal processing. An extension of the DWT is the discrete wavelet packet transform (DWPT), which provides more efficient signal analysis in some applications. Based on the DWPT and capsule generative adversarial networks (CapsGAN), this paper proposes an SSR network architecture named Wavelet-SRGAN. Comparative experimental results show that the proposed algorithm achieves performance comparable to existing state-of-the-art algorithms with the fewest parameters. The algorithm has three core steps: (1) adding a DWPT layer to the generator network; (2) adding a capsule network to the discriminator; (3) adding a wavelet loss during training.
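To make the DWPT and wavelet-loss ideas in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Haar wavelet, a two-level decomposition, and a mean-absolute-error form for the wavelet loss; the abstract specifies none of these. The function names `haar_step`, `haar_dwpt`, and `wavelet_loss` are hypothetical and chosen only for illustration.

```python
import math


def haar_step(signal):
    """One Haar analysis step: return (approximation, detail), each half length."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwpt(signal, levels):
    """Full wavelet packet tree: unlike the plain DWT, every subband
    (not only the approximation) is split again at each level.
    Returns 2**levels subbands in natural (filter-bank) order."""
    bands = [list(signal)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def wavelet_loss(predicted, target, levels=2):
    """Assumed form of a wavelet loss: mean absolute difference between
    the DWPT coefficients of the predicted and target signals."""
    if len(predicted) != len(target):
        raise ValueError("signals must have the same length")
    total = 0.0
    count = 0
    for p_band, t_band in zip(haar_dwpt(predicted, levels), haar_dwpt(target, levels)):
        for p, t in zip(p_band, t_band):
            total += abs(p - t)
            count += 1
    return total / count


if __name__ == "__main__":
    hr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    estimate = [v + 1.0 for v in hr]  # toy "reconstruction" with a constant offset
    print(len(haar_dwpt(hr, 2)), "subbands")
    print("wavelet loss:", wavelet_loss(estimate, hr))
```

The signal length must be divisible by `2**levels`. In a real training setup the same computation would be expressed with differentiable tensor operations so that the loss can be backpropagated through the generator; the pure-Python version here only illustrates the subband structure.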

Authors: CHEN Xi-kun (陈习坤); YANG Jun-mei (杨俊美) (School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510640, China)

Affiliation: School of Electronic and Information Engineering, South China University of Technology

Source: Acta Electronica Sinica (电子学报), 2023, No. 4, pp. 1039-1049 (11 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (No. 61871188, No. 61801133).

Keywords: speech super-resolution; generative adversarial networks; discrete wavelet transform; discrete wavelet packet transform; wavelet loss

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


