Abstract
Regong Tangka and murals are a distinctive art form of Tibetan culture, recognized as intangible cultural heritage of humanity and as national-level intangible cultural heritage. Their imagery depicts not only Buddhist Jataka stories but also the history, geography, culture, and technology of the Tibetan region. However, people without specialized knowledge of Regong arts find these works difficult to understand. To promote the dissemination of Tangka and murals, this study proposes an automatic detection algorithm for Tangka and mural elements. The proposed ECAMH-YOLOX model improves on the YOLOX algorithm in three ways. First, an efficient channel attention module is added so that the model captures better global image information while remaining lightweight. Second, a new detection head is added to the detection head module, so that detection is performed with four heads, which improves results for objects of different sizes. Third, the SIoU loss function is used to compute the regression loss, which accelerates convergence and improves model performance. Experimental results show that ECAMH-YOLOX produces no missed or false detections on the constructed Tangka and mural dataset, whereas YOLOX misses small objects. ECAMH-YOLOX also achieves an mAP@0.5:0.95 of 55.9%, which is 4.9 percentage points (0.049) higher than YOLOX. The model therefore improves detection performance while remaining lightweight, and it gives people a further way to learn about Regong arts.
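The efficient channel attention (ECA) step named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It follows the general ECA-Net recipe (global average pooling per channel, a 1D convolution across neighboring channels, then a sigmoid gate that rescales each channel). The function names `eca_kernel_size` and `eca`, the defaults `gamma=2` and `b=1`, and the hand-set kernel weights are assumptions for illustration; in a trained model the kernel weights are learned, and the abstract does not state where in YOLOX the module is placed.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size used by ECA-Net: (log2(C) + b) / gamma, forced to be odd."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca(feature_map, weights):
    """Apply channel attention to a feature map.

    feature_map: list of C channels, each an H x W list of lists of floats.
    weights: 1D kernel of odd length k (hypothetical fixed values here).
    Returns (rescaled feature map, per-channel gates).
    """
    c = len(feature_map)
    k = len(weights)
    pad = k // 2

    # Squeeze: global average pooling gives one scalar per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # Local cross-channel interaction: 1D convolution along the channel
    # axis with zero padding, so each gate depends on k neighboring channels.
    padded = [0.0] * pad + pooled + [0.0] * pad
    gates = []
    for i in range(c):
        s = sum(weights[j] * padded[i + j] for j in range(k))
        gates.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid

    # Excite: rescale every value in a channel by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


# Three 2x2 channels filled with 1.0, 2.0, and 3.0.
fmap = [[[float(n)] * 2 for _ in range(2)] for n in (1, 2, 3)]
scaled, gates = eca(fmap, weights=[0.0, 1.0, 0.0])
print(eca_kernel_size(64), gates)
```

Because the only extra parameters are the k kernel weights, the module adds very little to the model size, which is consistent with the abstract's emphasis on keeping the detector lightweight.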
Authors
LI Hongyun (李洪运)
ZHANG Xiaojuan (张效娟)
ZHAO Yang (赵洋)
PENG Chunyan (彭春燕)
Affiliations: School of Computer Science, Qinghai Normal University, Xining 810016, China; State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810016, China; School of Computer and Information, Hefei University of Technology, Hefei 230002, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2024, Issue 18, pp. 248-255 (8 pages)
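Funding
Qinghai Province Key Research and Development and Achievement Transformation Project (2021-GX-111)
National Natural Science Foundation of China (62262056)
National Key Research and Development Program of China, Key Special Project (2020YFC1523300)
Qinghai Normal University 2023 Undergraduate Innovation and Entrepreneurship Training Program (qhnucxcy2023019)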