An Image Semantic Hashing Indexing Method Based on Subspace Learning (cited by: 8)

Semantic Hashing with Image Subspace Learning


Abstract: As data volumes keep growing, fast and accurate indexing algorithms have become essential for information retrieval. To address this need, this paper proposes a hashing-based indexing method built on subspace learning. First, a subspace is learned from a partially labeled data set. To keep samples with the same semantic label close together after indexing (locality preservation), intra-class compactness is measured by the distances between each sample and its nearest neighbors. To make samples with different semantic labels easier to tell apart after indexing (discriminability), inter-class scatter is measured by the distances between the centers of the different semantic classes. After the constraints are relaxed, the subspace is learned with a method similar to linear discriminant analysis (LDA), and the subspace is used as the projection vectors of the hash functions. The biases are then computed from the learned projection vectors, which completes the hash functions. Code-discriminability and locality-preservation experiments on the MNIST and CIFAR-10 data sets, with comparisons against related methods, indicate that the proposed method is effective at retrieving semantically similar neighbors.
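The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact objective, so the code assumes a plain LDA-style ratio solved as an eigenproblem, a regularization term `reg`, a neighbor count `k`, and a bias equal to the negative mean projection. The function names `learn_hash_functions`, `encode`, and `hamming_distance` are hypothetical.

```python
import numpy as np

def learn_hash_functions(X, y, n_bits, k=5, reg=1e-3):
    """Learn projections W (d, n_bits) and biases b (n_bits,) from labeled data.

    X: (n, d) feature matrix, y: (n,) integer semantic labels.
    k and reg are illustrative assumptions, not values from the paper.
    """
    n, d = X.shape
    classes = np.unique(y)

    # Intra-class scatter: differences between each sample and its
    # k nearest neighbors that share the same label.
    Sw = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        kk = min(k, len(Xc) - 1)
        if kk < 1:
            continue
        dist = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=-1)
        np.fill_diagonal(dist, np.inf)            # exclude the sample itself
        nn = np.argsort(dist, axis=1)[:, :kk]     # (m, kk) neighbor indices
        diffs = (Xc[:, None, :] - Xc[nn]).reshape(-1, d)
        Sw += diffs.T @ diffs

    # Inter-class scatter: differences between class centers.
    centers = np.stack([X[y == c].mean(axis=0) for c in classes])
    cd = (centers[:, None, :] - centers[None, :, :]).reshape(-1, d)
    Sb = cd.T @ cd

    # Relaxed, LDA-like problem: leading eigenvectors of (Sw + reg*I)^-1 Sb.
    M = np.linalg.solve(Sw + reg * np.eye(d), Sb)
    evals, evecs = np.linalg.eig(M)
    order = np.argsort(-evals.real)[:n_bits]
    W = evecs[:, order].real

    # Assumed bias rule: center each projection at its training mean.
    b = -(X @ W).mean(axis=0)
    return W, b

def encode(X, W, b):
    """Binary codes: one bit per hash function, sign of the shifted projection."""
    return ((X @ W + b) > 0).astype(np.uint8)

def hamming_distance(code, codes):
    """Hamming distance from one code (n_bits,) to each row of codes."""
    return np.count_nonzero(codes != code, axis=1)
```

In this sketch the inter-class scatter matrix has rank at most the number of classes minus one, so bits requested beyond that count carry no additional between-class information; the paper may handle longer codes differently.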

Authors: 毛晓蛟 (Mao Xiaojiao), 杨育彬 (Yang Yubin)

Affiliation: State Key Laboratory for Novel Software Technology (Nanjing University)

Source: Journal of Software (《软件学报》), 2014, No. 8, pp. 1781-1793 (13 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

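Funding: National Natural Science Foundation of China (61273257, 61321491, 61035003); National Basic Research Program of China (973 Program) (2010CB327903); Program for New Century Excellent Talents in University, Ministry of Education (NCET-11-0213); Jiangsu Province Six Talent Peaks Program (2013-XXRJ-018); Natural Science Foundation of Jiangsu Province (BK2011005)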

Keywords: hash function; subspace; bias; locality preserving; discriminant

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

