Segmentation of abdominal CT and cardiac MR images with multi-scale visual attention

Cited by: 3


Abstract

Objective: Medical image segmentation is an important step in computer-aided diagnosis and surgical planning. However, because human organs have complex and varied structures, blurred tissue boundaries, and widely differing sizes, segmentation accuracy still needs to improve; more accurate segmentation helps doctors plan treatment and give advice more effectively. Deep-learning-based methods have recently become a focus of medical image segmentation research. The Transformer, which achieved great success in natural language processing, has also flourished in computer vision as the vision Transformer (ViT), and it is therefore favored by medical image segmentation researchers. However, current ViT-based medical image segmentation networks flatten image features into 1D sequences, ignoring the 2D structure of the image and the connections within it. Moreover, the quadratic computational complexity of ViT's multi-head self-attention (MHSA) mechanism makes the required computational overhead considerable.

Method: To address these problems, this paper proposes MSVA-TransUNet, a U-shaped network built on multi-scale visual attention (MSVA) with a Transformer as the backbone. MSVA is an attention mechanism implemented with multiple strip convolutions; its structure is similar to that of multi-head attention, but it uses convolution operations to capture long-distance dependencies. First, the network uses convolution kernels of different sizes to extract image features at different scales: one pair of strip convolutions approximates one large-kernel convolution, and strip-convolution pairs of different sizes approximate different large-kernel convolutions. Convolution captures local information, while the large kernels also learn long-distance dependencies in the image. Second, strip convolution is a lightweight convolution that markedly reduces the network's parameter count and floating-point operations, because the cost of convolution is much smaller than that of the quadratic-complexity multi-head attention. Third, MSVA avoids converting the image into a 1D sequence for input to the vision Transformer and makes full use of the 2D structure of the image to learn its features. Finally, the first patch embedding in the encoding stage is replaced with a convolution stem, which avoids converting a low channel count directly to a high one, a step that runs counter to the typical structure of convolutional neural networks (CNNs); the patch embeddings elsewhere are retained.

Result: Experiments on an abdominal multi-organ segmentation dataset (mainly containing eight organs) and a cardiac segmentation dataset (comprising three parts of the heart) show that the proposed network is more accurate than the baseline model: the average Dice improves by 3.74% on the abdominal dataset and by 1.58% on the cardiac dataset. Floating-point operations and parameter counts are also reduced relative to the MHSA mechanism and large-kernel convolution. The floating-point operations of the proposed attention are 1/278 of those of the multi-head attention mechanism, and the network has 15.31 M parameters, 1/6.88 of TransUNet's.

Conclusion: The proposed network is comparable to, or even exceeds, relatively advanced current networks such as TransUNet and SwinUNet. Replacing multi-head self-attention with multi-scale visual attention still captures long-distance relationships and extracts long-range image features, so segmentation performance improves while computational overhead falls. However, because some organs are small and located in particular positions, the network does not learn their features well enough, and its segmentation accuracy on these organs needs further improvement; this is the subject of continuing study. Code: https://github.com/BeautySilly/VA-TransUNet.
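
The abstract's two cost claims, that a pair of strip convolutions can stand in for one large-kernel convolution and that convolution is far cheaper than quadratic-complexity attention, can be illustrated with a small sketch. It is not the authors' implementation: the function names, the kernel sizes (7, 11, 21), the 64-channel 56 x 56 feature map, and the depthwise (per-channel) setting are all assumptions chosen for illustration. It also checks that a 1 x k pass followed by a k x 1 pass equals a single k x k pass whose kernel is the outer product of the two strips, which is the sense in which a strip pair covers a k x k window.

```python
# Illustrative sketch only. Shapes, kernel sizes and function names are
# assumptions, not values or APIs taken from the paper or its repository.


def depthwise_conv_cost(channels, height, width, kh, kw):
    """(parameters, multiply-accumulates) of a depthwise kh x kw convolution
    with stride 1 and 'same' padding, bias ignored."""
    params = channels * kh * kw
    macs = params * height * width
    return params, macs


def strip_pair_cost(channels, height, width, k):
    """Cost of a depthwise 1 x k convolution followed by a k x 1 one."""
    p_row, m_row = depthwise_conv_cost(channels, height, width, 1, k)
    p_col, m_col = depthwise_conv_cost(channels, height, width, k, 1)
    return p_row + p_col, m_row + m_col


def self_attention_macs(channels, height, width):
    """MACs of the two N x N products (Q K^T and attention x V) of
    single-head self-attention over N = height * width tokens.
    Linear projections are omitted, so this is a lower bound."""
    n = height * width
    return 2 * n * n * channels


def correlate2d_valid(image, kernel):
    """Plain 2D cross-correlation without padding, on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def demo():
    c, h, w = 64, 56, 56  # assumed feature-map shape
    for k in (7, 11, 21):  # assumed strip lengths
        full_params, full_macs = depthwise_conv_cost(c, h, w, k, k)
        pair_params, pair_macs = strip_pair_cost(c, h, w, k)
        print(f"k={k}: full {full_params} params / {full_macs} MACs, "
              f"strip pair {pair_params} params / {pair_macs} MACs")
    print("self-attention MACs:", self_attention_macs(c, h, w))

    # A strip pair is exactly a rank-1 (outer-product) k x k kernel.
    image = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(5)]
    row = [[1, 2, 1]]          # 1 x 3 strip
    col = [[1], [0], [-1]]     # 3 x 1 strip
    two_pass = correlate2d_valid(correlate2d_valid(image, row), col)
    outer = [[cv[0] * rv for rv in row[0]] for cv in col]
    print("strip pair == outer-product kernel:",
          two_pass == correlate2d_valid(image, outer))


if __name__ == "__main__":
    demo()
```

Under these assumptions a strip pair needs 2k weights per channel where a full depthwise kernel needs k², so the saving grows with k (14 against 49 weights for k = 7). The convolution cost grows linearly with the number of pixels, whereas the attention term grows with its square. Because a strip pair can only represent rank-1 kernels, it approximates a general large-kernel convolution; it does not reproduce one.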

Authors: Jiang Ting; Li Xiaoning (College of Computer Science, Sichuan Normal University, Chengdu 610101, China; College of Intelligent Science and Technology, Geely University, Chengdu 641423, China; Visual Computing and Virtual Reality Key Laboratory of Sichuan Province, Chengdu 610066, China)

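Affiliations: College of Computer Science, Sichuan Normal University; College of Intelligent Science and Technology, Geely University; Visual Computing and Virtual Reality Key Laboratory of Sichuan Province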

Source: Journal of Image and Graphics (中国图象图形学报), indexed in CSCD and the Peking University Core Journals list, 2024, No. 1, pp. 268-279 (12 pages)

Keywords: medical image segmentation; visual attention; Transformer; attention mechanism; abdominal multi-organ segmentation; cardiac segmentation

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]





