Abstract
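In recent years, lightweight deep-learning networks for single image super-resolution (SISR) reconstruction have become a research hotspot. However, existing lightweight methods are markedly limited in capturing long-range global dependencies between image pixels, mainly because explicitly modeling such dependencies incurs a very high computational cost. As a result, the performance of existing lightweight SISR methods still leaves considerable room for improvement. To address this, this paper proposes a novel Transformer-based lightweight network, the Intra-block and Inter-block Dual Aggregation Network (IIDAN), which explicitly captures global dependencies across the whole image and thereby achieves high-quality SISR. First, inspired by the non-local structural similarity of natural images, the paper proposes a novel Intra-block and Inter-block Transformer Module (IITM). By alternately exploiting self-attention within each image block and self-attention between different image blocks, IITM explicitly captures both the local features and the global structural similarity in the image. Second, the paper proposes an Information Interaction Mechanism (IIM) that supplies each of the two self-attention types in IITM with complementary information: IIM supplements the intra-block self-attention (Intra-block Transformer, Intra-T) with inter-block information, so that Intra-T obtains more global structural information; likewise, IIM supplements the inter-block self-attention (Inter-block Transformer, Inter-T) with local information, so that Inter-T obtains more local detail. Experimental results show that, compared with highly representative lightweight SISR methods of recent years, the proposed IIDAN reconstructs higher-quality super-resolution images with lower computational complexity.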
Single image super-resolution (SISR) reconstruction is an ill-posed and challenging inverse problem and a research hotspot in low-level computer vision. SISR attempts to reconstruct a clean high-resolution (HR) image with rich and natural texture details from its low-resolution (LR) version, which is crucial in various computer vision fields. Recently, lightweight networks for SISR have increased in popularity, and numerous lightweight SISR networks have been proposed for various practical applications. Deep learning has driven a surge of interest in lightweight SISR techniques, which have proven to be powerful tools for enhancing image quality. Despite their potential, these techniques often encounter a critical challenge: the difficulty of capturing the intricate, long-range interdependencies between pixels within an image. This limitation, primarily due to computational constraints, restricts the capabilities of lightweight SISR algorithms and leaves substantial room for improvement. In response to this challenge, we introduce the Intra-block and Inter-block Dual Aggregation Network (IIDAN), a Transformer-based lightweight network architecture. IIDAN is designed to explicitly capture the global dependencies that exist within images, thereby significantly enhancing the quality of SISR results. Our design is anchored in the inherent non-local structural similarities present in natural images. Building upon this insight, we propose the Intra-block and Inter-block Transformer Module (IITM), a novel module that applies self-attention at two distinct levels. The first level operates within a single block and is referred to as the intra-block transformer (Intra-T), while the second level operates across different blocks and is referred to as the inter-block transformer (Inter-T). By alternating between these two attention mechanisms, the IITM integrates the extraction of complex local features with the recognition of broad global structural patterns. Moreover, we introduce the Information Interaction Mechanism (IIM) as an enhancement to the IITM. This mechanism combines the strengths of Intra-T and Inter-T: inter-block information enriches intra-block attention, which expands the scope of structural understanding without compromising fine-grained detail, while inter-block attention is reinforced by the detailed local information from intra-block attention. The effectiveness of IIDAN is evidenced by a series of experiments, which demonstrate that IIDAN surpasses representative recent lightweight SISR methods. The framework combines a small number of parameters with reduced computational complexity while consistently producing high-quality super-resolution images. In conclusion, IIDAN is both computationally efficient and capable of generating high-fidelity super-resolution images, and its dual attention mechanism, complemented by the Information Interaction Mechanism, makes it a strong candidate for lightweight image quality enhancement.
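To make the alternation between intra-block and inter-block self-attention concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes non-overlapping square blocks, identity query/key/value projections, and that each block is flattened into a single token for the inter-block step; the abstract specifies none of these details. All function names (`to_blocks`, `intra_block_attention`, and so on) are hypothetical, and the IIM is not modeled.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    """Scaled dot-product self-attention over the second-to-last axis.
    Assumption: identity Q/K/V projections (no learned weights)."""
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens

def to_blocks(feat, b):
    """Split an (H, W, C) feature map into (num_blocks, b*b, C) blocks."""
    H, W, C = feat.shape
    x = feat.reshape(H // b, b, W // b, b, C).transpose(0, 2, 1, 3, 4)
    return x.reshape((H // b) * (W // b), b * b, C)

def from_blocks(blocks, H, W, b):
    """Inverse of to_blocks: (num_blocks, b*b, C) -> (H, W, C)."""
    C = blocks.shape[-1]
    x = blocks.reshape(H // b, W // b, b, b, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def intra_block_attention(feat, b):
    """Pixels attend only to pixels inside the same block (local features)."""
    H, W, _ = feat.shape
    return from_blocks(self_attention(to_blocks(feat, b)), H, W, b)

def inter_block_attention(feat, b):
    """Each block is one token; blocks attend to all other blocks (global structure)."""
    H, W, _ = feat.shape
    blocks = to_blocks(feat, b)                      # (N, b*b, C)
    tokens = blocks.reshape(blocks.shape[0], -1)     # (N, b*b*C)
    out = self_attention(tokens).reshape(blocks.shape)
    return from_blocks(out, H, W, b)

rng = np.random.default_rng(0)
feat = 0.1 * rng.standard_normal((8, 8, 4))          # toy 8x8 feature map, 4 channels
out = inter_block_attention(intra_block_attention(feat, 4), 4)
```

In this sketch the intra-block step costs on the order of N·(b²)² score entries and the inter-block step N², where N is the number of blocks, compared with (HW)² for attention over all pixels; how IIDAN actually tokenizes blocks and achieves its reported complexity is not stated in the abstract.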
Authors
唐述
曾琬凌
杨书丽
钟恒飞
陈卓
TANG Shu; ZENG Wan-Ling; YANG Shu-Li; ZHONG Heng-Fei; CHEN Zhuo (Chongqing Key Laboratory of Computer Network and Communications Technology, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065)
Source
《计算机学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, No. 12, pp. 2783-2802 (20 pages)
Chinese Journal of Computers
Funding
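Supported by:
National Natural Science Foundation of China (No. 61601070)
Natural Science Foundation of Chongqing, General Program (CSTB2023NSCQ-MSX0680)
Major Science and Technology Research Project of the Chongqing Municipal Education Commission (KJZD-M202300101)
Doctoral Innovative Talent Program of Chongqing University of Posts and Telecommunications (BYJS202217)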