A dual-encoder feature attention network for surgical instrument segmentation


Abstract

Objective Surgical instruments are indispensable tools for surgical tasks, and reducing surgical trauma remains an open problem. Emerging surgical robots can reduce the trauma caused by surgery and offer higher stability and stronger learning ability than manual operation. Precise segmentation of surgical instruments is a key step for the precise operation of surgical robots. Existing segmentation methods can locate surgical instruments and roughly segment their shape. However, low-contrast instruments, complex surgical environments, specular reflection, and variations in instrument scale and shape cause these methods to lose boundary information and detailed features, which results in blurred boundaries and misclassified details and lowers segmentation accuracy. To address these challenges, we propose a dual-encoder fusion segmentation network based on a Transformer and a convolutional neural network (CNN) for accurate segmentation of surgical instruments in endoscopic images.

Method Within the encoder-decoder framework, a dual-encoder fusion segmentation network is constructed as an end-to-end surgical instrument segmentation scheme. To overcome weak feature representation and obtain effective feature maps, a dual-encoder structure that fuses a CNN and a Transformer is built so that the network extracts local detail features and global contextual semantic information from endoscopic images simultaneously. Because surgical instruments vary in size and shape, effective multi-scale feature extraction is also essential for segmentation accuracy. A multi-scale attention fusion module that uses dilated convolution is therefore embedded in the bottleneck layer to enhance the local feature maps and produce multi-scale attention feature maps. To address the class imbalance in surgical instrument segmentation, a global attention module (an attention gated block) is introduced into the decoder, which increases the attention the network pays to instrument regions and reduces its attention to irrelevant features.

Result To verify the performance of the proposed model, two public surgical instrument segmentation datasets are used: the Kvasir-instrument dataset and the Endovis2017 (2017 Endoscopic Vision Challenge) dataset. Segmentation performance is evaluated through ablation, comparison, and visualization experiments, using both qualitative and quantitative analysis. On the Kvasir-instrument dataset, the proposed network achieves a Dice score of 96.46% and a mean intersection over union (mIOU) of 94.12%. On the Endovis2017 dataset, it achieves a Dice score of 96.27% and an mIOU of 92.55%. Compared with the progressive alternating attention network (PAANet) on the Kvasir-instrument dataset, the Dice score improves by 1.51% and the mIOU by 2.52%. Compared with the refined attention segmentation network (RASNet) on the Endovis2017 dataset, the Dice score improves by 1.62% and the mIOU by 2.22%. The ablation experiments support the design of the network: the dual-encoder module improves segmentation accuracy on both datasets, and removing any sub-module causes some loss of accuracy.

Conclusion To handle problems such as specular reflection, variation in instrument shape and size, and class imbalance, a CNN and Transformer based dual-encoder fusion segmentation network is developed as an end-to-end surgical instrument segmentation scheme. The proposed model combines the advantages of the CNN and the Transformer and extracts both detail features and global context information. It can segment surgical instruments of various shapes and sizes in endoscopic images accurately and stably, which offers potential support for robot-assisted surgery.
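The Dice score and mIOU reported in the abstract are standard overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they are commonly computed for binary masks, not the authors' evaluation code. The function names, the smoothing constant `eps`, and the choice to average IoU over images are assumptions made for illustration; the record does not state whether the paper averages over images or over classes.

```python
from typing import Sequence, Tuple

# Binary mask as nested sequences: 1 = instrument pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted positives, target positives)."""
    inter = pred_pos = target_pos = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t
            pred_pos += p
            target_pos += t
    return inter, pred_pos, target_pos


def dice_score(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|); eps (an assumption) avoids 0/0."""
    inter, pred_pos, target_pos = _counts(pred, target)
    return (2 * inter + eps) / (pred_pos + target_pos + eps)


def iou_score(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """IoU = |P ∩ T| / |P ∪ T|, with |P ∪ T| = |P| + |T| - |P ∩ T|."""
    inter, pred_pos, target_pos = _counts(pred, target)
    union = pred_pos + target_pos - inter
    return (inter + eps) / (union + eps)


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average IoU over a list of images (assumed averaging scheme)."""
    scores = [iou_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(f"Dice: {dice_score(pred, target):.4f}")
    print(f"IoU:  {iou_score(pred, target):.4f}")
```

For a single binary mask the two measures are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. The reported pairs are averages, so this identity need not hold exactly for them.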

Authors: Yang Lei; Gu Yuge; Bian Guibin; Liu Yanhong (School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)

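Affiliations: School of Electrical and Information Engineering, Zhengzhou University; Institute of Automation, Chinese Academy of Sciences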

Source: Journal of Image and Graphics (中国图象图形学报), indexed in CSCD and the Peking University Core Journals list, 2023, No. 10, pp. 3214-3230 (17 pages)

Funding: National Key Research and Development Program of China (2020YFB1313701); National Natural Science Foundation of China (62003309).

Keywords: deep learning; surgical instrument segmentation; convolutional neural network (CNN); Transformer; dual encoder; feature attention mechanism

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

