Abstract
OpenAI's multimodal model Contrastive Language-Image Pre-training (CLIP) and related application research were surveyed in depth. Using CLIP, experiments were carried out on a large-scale dataset of military-related images, and a multimodal search tool supporting text-to-image and image-to-image search was designed and developed. The tool performed well in practical tests and can lay a foundation for subsequent analysis and research on military-related images, including event classification, target detection, and task trajectory tracking.
Authors
ZHAO Jin-wei (赵晋巍)
LIU Xiao-peng (刘晓鹏)
LUO Wei (罗威)
CHENG Jin (程瑾)
MAO Bin (毛彬)
SONG Yu (宋宇)
Affiliation: Information Research Center of Military Sciences, Academy of Military Sciences, Beijing 100142, China
Source
Chinese Journal of Medical Library and Information Science (《中华医学图书情报杂志》)
CAS
2022, Issue 8, pp. 14-20 (7 pages)
Keywords
CLIP model
Multimodal
Image-text retrieval
Reverse image search