Abstract
Traditional near-duplicate document image retrieval methods require the types of variation among near-duplicate document images to be identified manually beforehand, which makes them susceptible to human subjectivity. To address this problem, we propose a three-stream convolutional Siamese network for near-duplicate document image retrieval that automatically learns the variations among images. The input to the network is a triplet consisting of a query image, a near-duplicate of the query, and a non-near-duplicate image. During training, a triplet loss drives the distance between the query image and its near-duplicate to be smaller than the distance between the query and the non-near-duplicate image. The proposed method achieves a mAP (mean average precision) of 98.76% and 96.50% on two datasets, respectively, outperforming existing near-duplicate document image retrieval methods.
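The triplet constraint described in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric, the margin, or the embedding size, so the Euclidean distance, the `margin=1.0` default, the function names, and the toy two-dimensional vectors below are all assumptions chosen for illustration; the vectors merely stand in for whatever features the three network branches would produce.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(query, near_dup, non_dup, margin=1.0):
    """Hinge-style triplet loss; the margin value is a hypothetical choice.

    The loss is zero once the query is closer to its near-duplicate than to
    the non-near-duplicate by at least `margin`, and positive otherwise.
    """
    d_pos = euclidean(query, near_dup)  # query vs. near-duplicate
    d_neg = euclidean(query, non_dup)   # query vs. non-near-duplicate
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical values, not taken from the paper).
query = [0.0, 0.0]
near_dup = [0.1, 0.0]
non_dup = [3.0, 4.0]

# Constraint satisfied: the near-duplicate is much closer than the non-duplicate.
loss_satisfied = triplet_loss(query, near_dup, non_dup)

# Constraint violated: swapping the roles puts the "positive" far away.
loss_violated = triplet_loss(query, non_dup, near_dup)
```

In this sketch the loss only penalizes triplets that break the ordering, which is one common way to encode the requirement that near-duplicates lie closer to the query than non-near-duplicates.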
Authors
许柏祥
刘丽
邱桃荣
XU Boxiang; LIU Li; QIU Taorong (School of Information Engineering, Nanchang University, Nanchang 330031, China)
Source
《智能系统学报》
CSCD
Peking University Core Journal
2022, Issue 3, pp. 515-522 (8 pages)
CAAI Transactions on Intelligent Systems
Funding
Young Scientists Fund of the National Natural Science Foundation of China (61603256).
Keywords
near-duplicate document image
image retrieval
three-stream convolutional Siamese network
triplet loss
image variations
triplet
feature extraction
robustness