
A Transformer-based deep conditional video compression (Cited by: 1)
Abstract: Most recent learning-based video compression methods are built on convolutional neural networks (CNN) and adopt a motion-compensation-plus-residual-coding architecture. Because a typical CNN can exploit only local correlations, and because prediction residuals are inherently sparse, such methods struggle to reach optimal compression performance. To address these problems, this paper proposes a Transformer-based deep conditional video compression algorithm that achieves better compression performance. Based on the motion information between adjacent frames, the proposed algorithm uses deformable convolution to obtain the predicted-frame features. These features then serve as conditional information for encoding the features of the original input frame, which avoids directly encoding the sparse residual signal. Exploiting the non-local correlations among features, the algorithm further employs a Transformer-based autoencoder to implement both motion coding and conditional coding, further improving compression performance. Experimental results show that the proposed algorithm surpasses current mainstream learning-based video compression algorithms on both the HEVC and UVG datasets.
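
The pipeline described in the abstract (deformable-convolution feature prediction followed by Transformer-based conditional coding) can be illustrated with a minimal PyTorch sketch. The module names, channel sizes, the offset predictor, and the use of a plain nn.TransformerEncoder below are illustrative assumptions, not the authors' published implementation.

```python
# Minimal sketch of conditional coding with deformable-convolution feature
# prediction and a Transformer encoder. All sizes/names are assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class ConditionalTransformerCodec(nn.Module):
    """Predict frame features with deformable convolution, then encode the
    current frame conditioned on that prediction instead of a residual."""

    def __init__(self, channels: int = 64, num_layers: int = 4, heads: int = 4):
        super().__init__()
        # Frame -> feature space (assumed simple convolutional stem).
        self.feat = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        # Offsets for a 3x3 deformable kernel (2 * 3 * 3 = 18 channels),
        # predicted here from a decoded 2-channel motion field.
        self.offset = nn.Conv2d(2, 18, kernel_size=3, padding=1)
        self.warp = DeformConv2d(channels, channels, kernel_size=3, padding=1)
        # Transformer over flattened spatial tokens captures the non-local
        # correlations between current and predicted features.
        layer = nn.TransformerEncoderLayer(
            d_model=2 * channels, nhead=heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.to_latent = nn.Linear(2 * channels, channels)

    def forward(self, cur_frame, ref_frame, motion):
        # 1) Feature prediction: warp reference-frame features with
        #    motion-driven deformable convolution.
        pred_feat = self.warp(self.feat(ref_frame), self.offset(motion))
        # 2) Conditional coding: concatenate current features with the
        #    prediction as a condition; no explicit residual is formed.
        cur_feat = self.feat(cur_frame)
        b, c, h, w = cur_feat.shape
        tokens = torch.cat([cur_feat, pred_feat], dim=1)   # (B, 2C, H, W)
        tokens = tokens.flatten(2).transpose(1, 2)         # (B, H*W, 2C)
        latent = self.to_latent(self.encoder(tokens))      # (B, H*W, C)
        # In a full codec this latent would be quantized and entropy coded.
        return latent.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    codec = ConditionalTransformerCodec()
    cur = torch.randn(1, 3, 32, 32)    # current frame
    ref = torch.randn(1, 3, 32, 32)    # previously decoded reference frame
    flow = torch.randn(1, 2, 32, 32)   # decoded motion field (placeholder)
    print(codec(cur, ref, flow).shape)  # torch.Size([1, 64, 32, 32])
```

In a complete codec, the latent would pass through quantization and a learned entropy model, and a matching Transformer decoder would reconstruct the frame from the latent together with the predicted features.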
Authors: LU Guo; ZHONG Tianxiong; GENG Jing (School of Computer Science and Engineering, Beijing Institute of Technology, Beijing 100081, China)
Source: Journal of Beijing University of Aeronautics and Astronautics (EI, CAS, CSCD, Peking University Core Journal), 2024, Issue 2, pp. 442-448 (7 pages)
Funding: National Natural Science Foundation of China (62102024).
Keywords: video compression; Transformer; deep learning; neural network; compression algorithm