Survey and Prospect of Intelligent Interaction-Oriented Image Recognition Techniques (cited by: 91)


Abstract: Vision plays an important role both in interaction between people and in human interaction with the natural world. Equipping terminal devices with intelligent visual recognition and interaction capabilities is one of the core challenges and long-term goals of artificial intelligence and computer technology. Visual recognition techniques have developed rapidly in recent years: new techniques keep emerging, new research problems keep being posed, and applications oriented toward intelligent interaction show new trends that are reshaping the established understanding of the field. This paper surveys image recognition techniques from three perspectives: visual recognition, visual description, and visual question answering (VQA). It first describes deep learning-based image recognition and scene classification, then analyzes and discusses the latest techniques in visual description and VQA, next introduces visual recognition and interaction applications for mobile devices and robots, and finally discusses future research trends in the field.

Authors: Jiang Shuqiang (蒋树强), Min Weiqing (闵巍庆), Wang Shuhui (王树徽)

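Affiliation: Key Laboratory of Intelligent Information Processing, Chinese Academy of Sciences (Institute of Computing Technology, Chinese Academy of Sciences)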

Source: Journal of Computer Research and Development (《计算机研究与发展》); indexed in EI, CSCD, and the Peking University Core Journals list; 2016, No. 1, pp. 113-122 (10 pages)

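Funding: Key Program of the National Natural Science Foundation of China (61532018); Excellent Young Scientists Fund of the National Natural Science Foundation of China (61322212); Young Scientists Fund of the National Natural Science Foundation of China (61303160); National Basic Research Program of China (973 Program) (2012CB316400)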

Keywords: image recognition; intelligent visual recognition; intelligent interaction; visual description; visual question answering (VQA); deep learning

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]









