Image annotation method based on feature fusion and cost-sensitive learning
Cited by: 1

Abstract: To address the inconsistent object scales and imbalanced label distributions found in image annotation datasets, an image annotation method based on feature fusion and cost-sensitive learning was proposed. A feature fusion layer, combined with an attention mechanism, was added to the convolutional neural network to modify the original VGG16 structure and selectively fuse the multi-scale features extracted by different convolutional layers, improving annotation accuracy for objects of different scales. Cost-sensitive learning was incorporated into the loss function used to train the network model, improving its generalization performance. Experimental results show that the proposed method improves the accuracy of image annotation and increases the recall of low-frequency labels.
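The abstract describes two mechanisms: attention-weighted fusion of multi-scale VGG16 features, and a cost-sensitive loss that up-weights rare labels. Below is a minimal PyTorch sketch of both ideas, assuming a torchvision VGG16 backbone; the module names (AttentionFusion, CostSensitiveBCE, Annotator), the stage boundaries, and the inverse-frequency cost weighting are illustrative assumptions, not the authors' published implementation.

```python
# Minimal sketch of the abstract's two ideas, assuming PyTorch/torchvision.
# All names and design details here are illustrative, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16


class AttentionFusion(nn.Module):
    """Selectively fuse multi-scale features from several VGG16 stages."""

    def __init__(self, in_channels=(256, 512, 512), out_channels=512, size=28):
        super().__init__()
        self.size = size
        # Project every stage to a common channel width.
        self.proj = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        # One attention logit per scale, predicted from pooled features.
        self.attn = nn.Linear(out_channels * len(in_channels), len(in_channels))

    def forward(self, feats):
        # Bring every stage to a common spatial size and channel width.
        mapped = [
            F.interpolate(p(f), size=(self.size, self.size),
                          mode="bilinear", align_corners=False)
            for p, f in zip(self.proj, feats)
        ]
        pooled = torch.cat([m.mean(dim=(2, 3)) for m in mapped], dim=1)
        weights = torch.softmax(self.attn(pooled), dim=1)  # (B, num_scales)
        return sum(w.view(-1, 1, 1, 1) * m
                   for w, m in zip(weights.unbind(dim=1), mapped))


class CostSensitiveBCE(nn.Module):
    """Multi-label BCE whose positive terms are weighted by inverse label
    frequency, so a missed rare label costs more than a frequent one."""

    def __init__(self, label_freq):
        super().__init__()
        # label_freq: per-label frequencies observed in the training set.
        cost = 1.0 / torch.clamp(label_freq, min=1e-6)
        self.register_buffer("pos_weight", cost / cost.mean())

    def forward(self, logits, targets):
        return F.binary_cross_entropy_with_logits(
            logits, targets, pos_weight=self.pos_weight
        )


class Annotator(nn.Module):
    """VGG16 backbone with a fusion layer over three convolutional stages."""

    def __init__(self, num_labels):
        super().__init__()
        features = vgg16(weights=None).features
        # Hypothetical stage split: relu3_3 (256 ch), relu4_3 / relu5_3 (512 ch).
        self.stage1 = features[:16]
        self.stage2 = features[16:23]
        self.stage3 = features[23:30]
        self.fuse = AttentionFusion(in_channels=(256, 512, 512))
        self.head = nn.Linear(512, num_labels)

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        fused = self.fuse([f1, f2, f3])
        return self.head(fused.mean(dim=(2, 3)))


if __name__ == "__main__":
    # Smoke test with random data; label_freq would come from the training set.
    model = Annotator(num_labels=100)
    criterion = CostSensitiveBCE(label_freq=torch.rand(100))
    logits = model(torch.randn(2, 3, 224, 224))
    loss = criterion(logits, torch.randint(0, 2, (2, 100)).float())
```

Routing the per-label cost through pos_weight in binary_cross_entropy_with_logits is one standard way to make a multi-label loss cost-sensitive; the paper's exact cost assignment is not specified in this record.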
Authors: SHE Xiang-yang, CHE Zi-hao, DONG Li-hong (College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an 710054, China)
Source: Computer Engineering and Design (Peking University Core Journal list), 2021, Issue 11, pp. 3114-3120 (7 pages)
Funding: Natural Science Basic Research Program of Shaanxi Province (2019JLM-11); Natural Science Foundation of Shaanxi Province (2017JM6105).
Keywords: automatic image annotation; deep learning; feature fusion; convolutional neural network; cost-sensitive learning

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部