
Survey and Prospect of Intelligent Interaction-Oriented Image Recognition Techniques
Cited by: 91
Abstract Vision plays an important role in both human-human interaction and human-nature interaction. Equipping terminal devices with intelligent visual recognition and interaction capabilities is one of the core challenges and long-term goals of artificial intelligence and computer technology. Visual recognition techniques have developed rapidly in recent years: new techniques keep emerging, new research problems keep being posed, and applications oriented toward intelligent interaction show new trends that are changing the established understanding of the field. We survey image recognition techniques from three perspectives: visual recognition, visual description, and visual question answering (VQA). Specifically, we first describe deep learning approaches to image recognition and scene classification. Next, we analyze and discuss the latest techniques in visual description and VQA. We then introduce visual recognition and interaction applications for mobile devices and robots. Finally, we discuss future research directions in this field.
Source Journal of Computer Research and Development (indexed in EI, CSCD, and the Peking University Core Journals list), 2016, No. 1, pp. 113-122 (10 pages)
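Funding National Natural Science Foundation of China, Key Program (61532018); National Natural Science Foundation of China, Excellent Young Scientists Fund (61322212); National Natural Science Foundation of China, Young Scientists Fund (61303160); National Basic Research Program of China (973 Program) (2012CB316400)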
Keywords image recognition; intelligent visual recognition; intelligent interaction; visual description; visual question and answering (VQA); deep learning

References (66)

  • 1. Andreopoulos A, Tsotsos J K. 50 years of object recognition: Directions forward [J]. Computer Vision and Image Understanding, 2013, 117(8): 827-891.
  • 2. Russakovsky O, Deng Jia, Su Hao, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115(3): 211-252.
  • 3. Zhou Bolei, Lapedriza A, Xiao Jianxiong, et al. Learning deep features for scene recognition using Places database [C] //Proc of the 28th Annual Conf on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2014: 487-495.
  • 4. Xiao Jianxiong, Hays J, Ehinger K, et al. Sun database: Large-scale scene recognition from abbey to zoo [C] //Proc of the IEEE Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2015: 3485-3492.
  • 5. Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [C] //Proc of the 26th Annual Conf on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2012: 1097-1105.
  • 6. Yosinski J, Clune J, Bengio Y, et al. How transferable are features in deep neural networks? [C] //Proc of the 28th Annual Conf on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2014: 3320-3328.
  • 7. Zeiler M D, Fergus R. Visualizing and understanding convolutional networks [C] //Proc of the 16th European Conf on Computer Vision. Berlin: Springer, 2014: 297-312.
  • 8. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition [J]. CoRR abs/1409.1556, 2014.
  • 9. Szegedy C, Liu Wei, Jia Yangqing, et al. Going deeper with convolutions [C] //Proc of the IEEE Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2015: 1-9.
  • 10. Donahue J, Jia Yangqing, Vinyals O, et al. DeCAF: A deep convolutional activation feature for generic visual recognition [C] //Proc of the 31st Int Conf on Machine Learning. New York: ACM, 2014: 647-655.

Second-level references (31)

  • 1. Li L, Jiang S, Huang Q. Learning hierarchical semantic description via mixed-norm regularization for image understanding. IEEE Transactions on Multimedia, 2012, 14(5): 1401-1413.
  • 2. Bo L, Ren X, Fox D. Unsupervised feature learning for RGB-D based object recognition. In Springer Tracts in Advanced Robotics 88, Desai J P, Dudek G, Khatib O, Kumar V (eds.), Springer, pp. 387-402.
  • 3. Gupta S, Arbeláez P, Girshick R, Malik J. Indoor scene understanding with RGB-D images: Bottom-up segmentation, object detection and semantic segmentation. International Journal of Computer Vision, 2014. http://link.springer.com/article/10.1007/s11263-014-0777-6#, Feb. 2015.
  • 4. Chai X, Li G, Lin Y, Xu Z, Tang Y, Chen X, Zhou M. Sign language recognition and translation with Kinect. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, April 2013.
  • 5. Lowe D G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, 60(2): 91-110.
  • 6. Johnson A E, Hebert M. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(5): 433-449.
  • 7. Morisset B, Rusu R B, Sundaresan A, Hauser K, Agrawal M, Latombe J C, Beetz M. Leaving flatland: Toward real-time 3D navigation. In Proc. IEEE International Conference on Robotics and Automation, May 2009, pp. 3786-3793.
  • 8. Hinterstoisser S, Holzer S, Cagniart C et al. Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In Proc. IEEE International Conference on Computer Vision (ICCV), Nov. 2011, pp. 858-865.
  • 9. Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In Proc. Neural Information Processing Systems, Dec. 2012.
  • 10. Zhang Z, Zhou C, Xin B, Wang Y, Gao W. An interactive system of stereoscopic video conversion. In Proc. the 20th ACM International Conference on Multimedia, Oct. 29-Nov. 2, 2012, pp. 149-158.

Co-citing documents (5)

Co-cited documents (574)

Citing documents (91)

Second-level citing documents (421)
