Near-duplicate document image retrieval based on three-stream convolutional Siamese network
Cited by: 1
Abstract: Traditional near-duplicate document image retrieval methods require the types of variation among near-duplicate document images to be identified manually beforehand, which makes them susceptible to human subjectivity. To address this problem, we propose a three-stream convolutional Siamese network for near-duplicate document image retrieval that automatically learns the variations among images. The input to the network is a triplet consisting of a query image, a near-duplicate of the query image, and a non-near-duplicate image. During training, a triplet loss drives the distance between the query image and its near-duplicate to be smaller than the distance between the query image and the non-near-duplicate image. The proposed method achieves a mean average precision (mAP) of 98.76% and 96.50% on two datasets, respectively, outperforming existing near-duplicate document image retrieval methods.
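The triplet objective described in the abstract can be illustrated with a minimal sketch. The abstract only states that the loss pushes the query closer to its near-duplicate than to the non-near-duplicate; the hinge form, the Euclidean distance, the `margin` value, and the function names below are assumptions made for illustration, not details taken from the paper. The input vectors stand in for embeddings that the three network branches would produce.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(query, near_dup, non_dup, margin=1.0):
    """Hinge-style triplet loss (assumed form, hypothetical margin).

    The loss is zero once the query is closer to its near-duplicate
    than to the non-near-duplicate by at least `margin`.
    """
    d_pos = euclidean(query, near_dup)  # query vs. near-duplicate
    d_neg = euclidean(query, non_dup)   # query vs. non-near-duplicate
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings standing in for network outputs (made-up values).
query = [0.0, 0.0]
near_dup = [0.0, 1.0]   # distance 1 from the query
non_dup = [3.0, 4.0]    # distance 5 from the query

# Correctly ordered triplet: constraint satisfied, so the loss is zero.
loss_satisfied = triplet_loss(query, near_dup, non_dup)

# Swapped roles: the "near-duplicate" is farther away, so the loss is positive.
loss_violated = triplet_loss(query, non_dup, near_dup)
```

In this toy case the first triplet already satisfies the distance ordering and contributes no loss, while the swapped triplet is penalized in proportion to how badly the ordering is violated.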
Authors: XU Boxiang; LIU Li; QIU Taorong (School of Information Engineering, Nanchang University, Nanchang 330031, China)
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 3, pp. 515-522 (8 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (61603256).
Keywords: near-duplicate document image; image retrieval; three-stream convolutional Siamese network; triplet loss; image variations; triplet; feature extraction; robustness