Abstract
Multi-modal medical corpora are important tools for medical research, clinical diagnosis, and teaching. However, most existing medical corpora contain only text and lack matching images, so the information they provide is incomplete. In addition, many medical images have no clear semantic labels, which makes them difficult to use in corpus construction. To address these problems, this paper proposes a dermoscopic image classification method for multi-modal medical corpus construction: dermoscopic images are classified accurately to obtain semantic labels, and natural language processing methods then match those labels with relevant text, yielding a multi-modal corpus that combines images and text. First, because traditional machine-learning-based image classification methods extract lesion features weakly and are susceptible to background noise, which leads to poor lesion classification accuracy, a two-stream network is constructed that strengthens lesion feature extraction by fusing the shape and texture features of lesions. Second, to reduce the information redundancy caused by feature fusion, a feature selection method based on a channel attention mechanism is introduced to focus on key features and suppress the influence of noise. In addition, to address the optimization difficulty caused by the imbalance between benign and malignant samples in dermoscopic images, an asymmetric loss function is introduced to improve the model's robustness to sample imbalance. Experimental results on the ISIC dermoscopic image dataset show that the proposed method can classify dermoscopic images quickly and accurately and can match images with the corresponding medical record text to construct a multi-modal medical corpus.
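As an illustration of how an asymmetric loss can ease a benign/malignant imbalance, the following is a minimal sketch for a single binary prediction. The abstract does not state which formulation the paper uses, so the form below (separate focusing exponents for positive and negative samples, plus a probability margin that discards very easy negatives) is an assumption, and the names `asymmetric_loss`, `gamma_pos`, `gamma_neg`, and `margin` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def asymmetric_loss(logit, label, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Asymmetric loss for one sample (assumed form, not the paper's exact one).

    label == 1: minority (e.g. malignant) sample, weighted by (1 - p) ** gamma_pos.
    label == 0: majority (e.g. benign) sample; its probability is first shifted
    down by `margin`, then down-weighted by p_shifted ** gamma_neg, so that
    easy negatives contribute little or nothing to the loss.
    """
    p = sigmoid(logit)
    if label == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    p_shifted = max(p - margin, 0.0)
    return -(p_shifted ** gamma_neg) * math.log(max(1.0 - p_shifted, eps))


if __name__ == "__main__":
    # Compare an easy negative, a hard negative, and a positive sample.
    for logit, label in [(-4.0, 0), (2.0, 0), (0.0, 1)]:
        print(logit, label, asymmetric_loss(logit, label))
```

With `gamma_pos=0`, positive samples reduce to ordinary cross-entropy, while confidently classified negatives are suppressed. This is one way such a loss can keep the abundant class from dominating optimization.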
Authors
韩泓丞
林玉萍
郭钦钵
张栋
许美凤
朱龙飞
李小棉
冯丽丽
岳婕
HAN Hongcheng; LIN Yuping; GUO Qinbo; ZHANG Dong; XU Meifeng; ZHU Longfei; LI Xiaomian; FENG Lili; YUE Jie (College of Artificial Intelligence, Xi’an Jiaotong University, Xi’an 710049, China; School of Foreign Studies, Xi’an Jiaotong University, Xi’an 710049, China; School of Automation Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China; Department of Dermatology, Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710049, China; Department of English Education, Jeonbuk National University, Jeonju-si 560759, South Korea; Department of Pediatrics, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China)
Source
《西北大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, Issue 3, pp. 377-386 (10 pages)
Journal of Northwest University(Natural Science Edition)
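Funding
General Program of the Natural Science Basic Research Program of Shaanxi Province (2022JM-324)
Shaanxi Provincial Social Science Fund Project (2021K014)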
Keywords
multi-modal corpus
dermoscopic image
image classification
convolutional neural network
natural language processing