
Beijing Opera character recognition based on attention mechanism with HyperColumn (cited by: 3)
Abstract: To overcome the difficulty of extracting visual features of Beijing Opera characters and to meet the demand for real-time recognition, a Convolutional Neural Network based on HyperColumn Attention (HCA-CNN) is proposed for fine-grained feature extraction and recognition of Beijing Opera characters. The attention mechanism that localizes key regions in the network borrows the idea of HyperColumn features, which were introduced for image segmentation and fine-grained localization. The HyperColumn set concatenates, pixel by pixel, the features of the backbone classification network into multi-layer stacked features, so that early shallow spatial features and late deep class-level semantic features are both taken into account and the accuracy of both the localization task and the backbone classification task improves. The backbone is the lightweight MobileNetV2, which better meets the real-time requirements of video applications. In addition, the BeiJing Opera Role (BJOR) dataset was created, and ablation experiments were carried out on it. Experimental results show that, compared with the traditional fine-grained Recurrent Attention Convolutional Neural Network (RA-CNN), HCA-CNN improves accuracy by 0.63 percentage points, reduces Memory Usage and Params by 162.84 MB and 131.5 MB respectively, and reduces multiply-add operations (Mult-Adds) and floating-point operations (FLOPs) by 39885×10^6 and 51886×10^6 respectively. These results indicate that HCA-CNN can effectively improve both the accuracy and the efficiency of Beijing Opera character recognition and can meet the requirements of practical applications.
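The pixel-by-pixel concatenation of multi-layer features described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `upsample_nearest` and `hypercolumns` are hypothetical, the toy feature maps are made up, and nearest-neighbour resizing is an assumption chosen for simplicity. It only shows how activations from a shallow, high-resolution layer and a deep, low-resolution layer can be aligned to one grid and stacked into a single vector per pixel.

```python
def upsample_nearest(channel, out_h, out_w):
    """Resize one 2-D channel (a list of rows) to out_h x out_w by nearest neighbour."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def hypercolumns(feature_maps, out_h, out_w):
    """Stack per-pixel activations from several layers into one vector per pixel.

    feature_maps: list of layers; each layer is a list of 2-D channels.
    Returns an out_h x out_w grid whose cells hold the concatenated channel values.
    """
    # Bring every channel of every layer to the same spatial size.
    resized = [
        upsample_nearest(channel, out_h, out_w)
        for layer in feature_maps
        for channel in layer
    ]
    # For each pixel, read the value of every resized channel in layer order.
    return [
        [[channel[y][x] for channel in resized] for x in range(out_w)]
        for y in range(out_h)
    ]


# Toy input (hypothetical values): a shallow 4x4 layer with 1 channel
# and a deep 2x2 layer with 2 channels.
shallow = [[[r * 4 + c for c in range(4)] for r in range(4)]]
deep = [[[10, 20], [30, 40]], [[1, 2], [3, 4]]]
columns = hypercolumns([shallow, deep], 4, 4)
```

Each cell of `columns` holds three values here: one from the shallow layer, which keeps fine spatial detail, and two from the deep layer, which carry coarser semantic information. An attention module could then score these per-pixel vectors to propose a key region, though how HCA-CNN does so is not specified in the abstract.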
Authors: QIN Jun; LUO Yifan; TIE Jun; ZHENG Lu; LYU Weilong (College of Computer Science, South-Central University for Nationalities, Wuhan, Hubei 430074, China; Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprises (South-Central University for Nationalities), Wuhan, Hubei 430074, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 4, pp. 1027-1034 (8 pages)
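Funding: National Natural Science Foundation of China (61902437); Major Project of the Hubei Province Technological Innovation Special Program (2019ABA101).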
Keywords: HyperColumn; attention mechanism; recurrent network; fine-grained; Beijing Opera character recognition
