Abstract
Amid the rapid development of the mobile Internet, smart terminals, and artificial intelligence in the era of big data, mobile visual search, which takes cross-modal multimedia data such as images, audio, video, and 3D models as its retrieval objects, has become a research hotspot, and realizing visual data retrieval through cross-modal knowledge collaboration has become a pressing task. Taking deep learning, the mainstream technology for cross-modal data retrieval, as its main thread, this paper reviews the current state of research on the system frameworks and key technologies of cross-modal data retrieval, grouping it into four categories: methods based on convolutional and recurrent neural networks, methods based on graph network representation, methods based on generative adversarial networks, and methods based on deep hashing. It also examines the difficulties that remain unsolved in current research and looks ahead to future development trends, providing a theoretical basis for more in-depth exploration.
Author
朱维乔
Zhu Weiqiao (Guangzhou Maritime University, Guangzhou, Guangdong 510725, China)
Source
《高校图书馆工作》
2022, No. 5, pp. 41-45 (5 pages)
Library Work in Colleges and Universities
Funding
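A research output of the Guangdong Province Philosophy and Social Sciences 13th Five-Year Plan project "Research on the Construction of a Deep-Learning-Based Mobile Visual Search Mechanism in the Big Data Environment" (GD18XTS04).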
Keywords
Deep learning
Cross-modal
Visual data retrieval