Abstract
A segmentation method that fuses CNN and ViT is proposed to address three problems: the large variation in the shape and size of tumor regions in breast ultrasound images, which makes segmentation difficult; the limitations of convolutional neural networks (CNNs) in modeling long-range dependencies and spatial correlations; and the very large amounts of training data required by the vision Transformer (ViT). An improved Swin Transformer module and a CNN encoder module based on deformable convolution extract global features and local detail features, respectively, and a cross-attention mechanism is designed to fuse the feature representations of these two scales. Training uses a binary cross-entropy loss combined with a boundary loss, which effectively improves segmentation accuracy. Experimental results on two public datasets show that the proposed method significantly improves segmentation compared with existing classical algorithms, raising the Dice coefficient by 3.8412% and verifying the method's effectiveness and feasibility.
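The abstract does not specify the fusion module's internal design, but the cross-attention idea it describes can be illustrated with a minimal PyTorch sketch. Here the module shape, dimensions, and class name (CrossAttentionFusion) are illustrative assumptions, not the paper's exact architecture: CNN features query the Transformer features for global context, Transformer features query the CNN features for local detail, and the two enriched streams are summed and normalized.

```python
# Minimal sketch (assumed design, not the paper's exact module) of
# cross-attention fusion between a CNN feature map and a Transformer
# feature map at the same spatial scale.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Two attention directions: local queries global, global queries local.
        self.cnn_to_vit = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.vit_to_cnn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cnn_feat: torch.Tensor, vit_feat: torch.Tensor) -> torch.Tensor:
        # cnn_feat, vit_feat: (B, C, H, W) feature maps of matching shape.
        b, c, h, w = cnn_feat.shape
        cnn_seq = cnn_feat.flatten(2).transpose(1, 2)  # (B, H*W, C)
        vit_seq = vit_feat.flatten(2).transpose(1, 2)  # (B, H*W, C)
        # CNN tokens attend over Transformer tokens, and the reverse.
        local_enriched, _ = self.cnn_to_vit(cnn_seq, vit_seq, vit_seq)
        global_enriched, _ = self.vit_to_cnn(vit_seq, cnn_seq, cnn_seq)
        fused = self.norm(local_enriched + global_enriched)
        return fused.transpose(1, 2).reshape(b, c, h, w)
```

A bidirectional design like this lets each branch compensate for the other's weakness: the deformable-convolution encoder supplies precise boundary detail, while the Swin branch supplies the long-range context that plain convolutions lack.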
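The hybrid objective can likewise be sketched. The abstract names only "binary cross-entropy loss combined with a boundary loss"; the formulation below follows one common boundary loss (Kervadec et al.), which weights predicted probabilities by a precomputed signed distance map of the ground-truth contour. The weighting factor alpha and the distance-map pipeline are assumptions for illustration.

```python
# Minimal sketch (assumed formulation) of BCE + boundary loss for
# binary tumor segmentation. dist_map is precomputed per image from
# the ground-truth mask via a signed Euclidean distance transform.
import numpy as np
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    # Negative inside the tumor, positive outside, zero on the boundary.
    inside = distance_transform_edt(mask)
    outside = distance_transform_edt(1 - mask)
    return outside - inside

def hybrid_loss(logits: torch.Tensor, target: torch.Tensor,
                dist_map: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    # logits, target, dist_map: (B, 1, H, W); target is a float {0,1} mask.
    bce = F.binary_cross_entropy_with_logits(logits, target)
    probs = torch.sigmoid(logits)
    # Boundary term: probability mass far outside the true contour is
    # penalized; mass inside (negative distance) is rewarded.
    boundary = (probs * dist_map).mean()
    return alpha * bce + (1 - alpha) * boundary
```

Pairing a region term (BCE) with a contour-aware term is a standard way to sharpen lesion boundaries, which matters here because tumor shape varies so widely across breast ultrasound images.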
Authors
PENG Yutong (彭雨彤), LIANG Fengmei (梁凤梅)
College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Jinzhong 030600, China
Source
《智能系统学报》 (CAAI Transactions on Intelligent Systems)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2024, No. 3, pp. 556-564 (9 pages)
Funding
Shanxi Provincial Key Research and Development Program (202102030201012).