Abstract
Multimodal medical images provide more comprehensive and accurate image descriptions for clinical applications such as medical diagnosis, treatment planning, and surgical navigation. Because disease types are diverse and complex, diagnosing a disease and localizing its lesions from a single-modality medical image is often impossible; multimodal medical image fusion methods address this problem. The fused images obtained by such methods carry richer and more complete information and help medical imaging better serve clinical practice. To comprehensively survey the current state of medical image fusion methods, this paper reviews the relevant literature published at home and abroad in recent years. Medical image fusion techniques are classified into two categories, traditional methods and deep learning methods, and the advantages and disadvantages of each are summarized. Combining the imaging principles of multimodal medical images with the image characteristics of various diseases, the fusion techniques for different anatomical regions and different diseases are analyzed and compared qualitatively. Existing multimodal medical image databases are summarized, and 25 commonly used quality evaluation metrics for medical image fusion are reviewed by category. Twenty-two multimodal medical image fusion algorithms from the traditional and deep learning fields are summarized. In addition, experiments are conducted to compare the performance of deep learning-based and traditional medical image fusion methods; through qualitative and quantitative analysis of the fusion results on three pairs of multimodal medical images, the strengths and weaknesses of the fusion algorithms in each technical field are summarized. Finally, the current state, key challenges, and future prospects of medical image fusion technology are discussed.
Multimodal medical image fusion yields more comprehensive and accurate image descriptions for clinical applications such as medical diagnosis, treatment planning, and surgical navigation. Because disease types are diverse and complex, diagnosing a disease and localizing its lesions from a single-modality medical image is difficult; multimodal medical image fusion methods therefore aim to obtain medical images with richer information for clinical use. Medical imaging techniques are mainly divided into electromagnetic energy-based and acoustic energy-based ones. The latter exploit the different propagation speeds of ultrasound in different media to achieve real-time, dynamic imaging. Current medical image fusion techniques are mainly concerned with the static images produced by electromagnetic energy-based modalities, namely X-ray computed tomography (CT), single photon emission computed tomography (SPECT), positron emission tomography (PET), and magnetic resonance imaging (MRI). We review the recent literature on the current state of medical image fusion and divide existing techniques into two categories: 1) traditional methods and 2) deep learning methods.

Among traditional methods, spatial domain and frequency domain algorithms remain the most active. Spatial domain techniques evaluate image element values directly through pixel-level strategies, and the fused images achieve less spatial distortion at the cost of a lower signal-to-noise ratio. Spatial domain methods include 1) simple minimum/maximum, 2) independent component analysis, 3) principal component analysis, 4) weighted average, 5) simple average, 6) fuzzy logic, and 7) the cloud model. Their fusion process is simple, their low algorithmic complexity keeps the computational cost down, and they perform relatively well at alleviating spectral distortion in the fused image. However, their fusion results still fall short in clarity and contrast and suffer from low spatial resolution. In the frequency domain, the input image is first converted from the spatial domain to the frequency domain via a Fourier transform; the fusion algorithm is then applied to the converted image, and the final fused image is obtained through the inverse Fourier transform. Commonly used frequency domain fusion algorithms comprise 1) pyramid transforms, 2) wavelet transforms, and 3) multiscale geometric transforms. These multi-level decomposition-based methods enhance the detail retention of the fused image, and the output fusion results contain high spatial resolution and high-quality spectral components. However, this type of algorithm also depends on fine-grained fusion rule design, as illustrated in the sketch below.
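To make the two traditional families concrete, the following minimal sketch (our illustration, not code from any surveyed method) contrasts a pixel-level weighted-average rule in the spatial domain with a wavelet-domain rule that averages approximation coefficients and keeps the maximum-absolute detail coefficients. It assumes NumPy and PyWavelets and co-registered grayscale source images of equal size; all function names are hypothetical.

import numpy as np
import pywt  # PyWavelets

def fuse_weighted_average(img_a, img_b, w=0.5):
    # Spatial domain rule: blend pixel values directly. Simple and cheap,
    # but complementary details from the two modalities get averaged away.
    return w * img_a + (1.0 - w) * img_b

def fuse_dwt(img_a, img_b, wavelet="db2", level=3):
    # Frequency (transform) domain rule: decompose, fuse per subband, invert.
    ca = pywt.wavedec2(img_a, wavelet, level=level)
    cb = pywt.wavedec2(img_b, wavelet, level=level)
    # Average the low-frequency approximation band.
    fused = [0.5 * (ca[0] + cb[0])]
    # For each detail subband (horizontal, vertical, diagonal at each level),
    # keep the coefficient with the larger magnitude -- a common
    # max-absolute activity measure that preserves the stronger edge.
    for da, db in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(da, db)))
    # Note: waverec2 may pad by one row/column for odd-sized inputs.
    return pywt.waverec2(fused, wavelet)

On a typical CT/MRI pair, the weighted average tends to lower contrast, while the wavelet rule retains edges from both sources, qualitatively mirroring the trade-offs described above.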
Deep learning-based methods build mainly on convolutional neural networks (CNNs) and generative adversarial networks (GANs); they avoid fine-grained fusion rule design, reduce manual involvement in the process, and their stronger feature extraction capability lets the fusion results retain more source image information. A CNN can effectively process the spatial and structural information in the neighborhood of the input image. It consists of a series of convolutional layers, pooling layers, and fully connected layers: the convolutional and pooling layers extract features from the source images, and the fully connected layers map those features to the final output. In CNN-based approaches, image fusion is regarded as a classification problem corresponding to feature extraction, feature selection, and output prediction, and the fusion task targets image transformation, activity level measurement, and fusion rule design. Unlike CNNs, GANs model the saliency information in medical images through an adversarial learning mechanism. A GAN is a generative model with two multilayer networks: a generator that produces pseudo data and a discriminator that classifies images as real or pseudo. Back-propagation-based training improves the GAN's ability to distinguish real data from generated data. Although GANs are not yet as widely used in multimodal medical image fusion (MMIF) as CNNs, they hold potential for in-depth research.

We further develop a complete overview of existing multimodal medical image databases and fusion quality evaluation metrics. Four open-source, freely accessible medical image databases are covered: the open access series of imaging studies (OASIS) dataset, the cancer imaging archive (TCIA) dataset, the whole brain atlas (AANLIB) dataset, and the Alzheimer's disease neuroimaging initiative (ADNI) dataset; a gene database of green fluorescent protein and phase-contrast images, the John Innes Centre (JIC) dataset, is included as well. Our review summarizes 25 commonly used evaluation indicators for medical image fusion results in four categories: 1) information theory-based, 2) image feature-based, 3) image structural similarity-based, and 4) human visual perception-based, together with 22 fusion algorithms applied to medical image datasets in recent years. The pros and cons of these algorithms are analyzed through technical comparison and through the fusion modes and evaluation indexes of each algorithm.

In addition, we carry out extensive experiments to compare the performance of deep learning-based and traditional medical image fusion methods. Source images of three modality pairs are tested qualitatively and quantitatively with the 22 multimodal medical image fusion algorithms. For the qualitative analysis, the brightness, contrast, and distortion of the fused images are judged with respect to the human visual system; for the quantitative analysis, 15 objective evaluation indexes are used. Drawing on these qualitative and quantitative results, we discuss the current situation, key challenges, and future directions of medical image fusion techniques. Both traditional and deep learning methods have advanced fusion performance to a certain extent, and with further algorithm optimization and richer medical image datasets, more fusion methods with good fusion effect and high model robustness can be expected. The two technical fields will continue to develop toward the common research trends of expanding multi-region, multi-case medical image data, proposing effective indicators suited to medical image fusion, and deepening the research scope of image fusion.
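As an illustration of the information theory-based category of metrics summarized above, the sketch below (an assumed implementation for illustration, not the paper's evaluation code) computes the gray-level entropy of a fused image and the mutual information (MI) between a source image and the fused result; the conventional fusion MI score sums the MI over both sources. It assumes 8-bit grayscale NumPy arrays, and the function names are hypothetical.

import numpy as np

def entropy(img, bins=256):
    # Shannon entropy of the gray-level histogram: a higher value usually
    # indicates that the fused image carries more information.
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(src, fused, bins=256):
    # MI between a source image and the fused result, estimated from the
    # joint gray-level histogram: how much source information survived fusion.
    joint, _, _ = np.histogram2d(src.ravel(), fused.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of the source
    py = pxy.sum(axis=0, keepdims=True)   # marginal of the fused image
    nz = pxy > 0                          # avoid log(0)
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

def fusion_mi(src_a, src_b, fused):
    # Conventional fusion MI: contributions from both source modalities.
    return mutual_information(src_a, fused) + mutual_information(src_b, fused)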
Authors
Huang Yuping (黄渝萍) and Li Weisheng (李伟生), Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Source
Journal of Image and Graphics (《中国图象图形学报》)
Indexed in CSCD and the Peking University Core Journals list
2023, No. 1, pp. 118-143 (26 pages)
Funding
National Key Research and Development Program of China (2019YFE0110800, 2016YFC1000307-3)
National Natural Science Foundation of China (61972060, U1713213, 62176071, 62027827)
Natural Science Foundation of Chongqing (cstc2020jcyj-zdxmX0025, cstc2019cxcyljrc-td0270, cstc2019jcyj-cxttX0002)