A New High-Dimensional Indexing Structure Based on Principal Component Sorting


Abstract: The vector approximation approach is an efficient high-dimensional indexing method for overcoming the "curse of dimensionality". This paper improves the index structure of the vector approximation method by exploiting the energy-compaction property of the KL transform. Approximate vectors are built in the KL transform domain, and the component with the largest energy (the first component) is chosen as the principal component. The approximate vectors are stored sequentially in order of their principal component values, and a B+-tree stores the range of principal component values held in each data page. During k-nearest-neighbor search, a transform-domain partial distortion search algorithm rejects unsuitable approximate vectors: starting from an initial data page, the approximate vectors are visited sequentially in both the ascending and the descending direction. The improved structure keeps the sequential-access property while greatly reducing the number of data pages accessed, so only a small set of approximate vectors is visited during a search. Experiments on large high-dimensional image feature databases show that the new index structure reduces the I/O time of the search and also raises the CPU search speed, giving a higher overall search speed than the well-known VA+-file approach.
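
The search procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy, computes the KL transform as PCA by eigendecomposition of the covariance matrix, keeps full-precision transformed vectors instead of quantized approximate vectors, and replaces the B+-tree and data pages with a sorted array searched by `np.searchsorted`. The function names `build_index`, `partial_distance`, and `knn_search` are hypothetical.

```python
import heapq
import numpy as np

# Illustrative sketch only: exact KL coefficients stand in for the quantized
# approximate vectors, and a sorted array stands in for the B+-tree and pages.


def build_index(data):
    """Rotate data into the KL (PCA) domain and sort rows by the first component."""
    mean = data.mean(axis=0)
    # Eigenvectors of the covariance matrix form an orthonormal KL basis.
    eigvals, eigvecs = np.linalg.eigh(np.cov(data - mean, rowvar=False))
    basis = eigvecs[:, np.argsort(eigvals)[::-1]]  # largest-energy component first
    coeffs = (data - mean) @ basis
    order = np.argsort(coeffs[:, 0], kind="stable")  # sort by principal component
    return mean, basis, coeffs[order], order


def partial_distance(a, b, bound):
    """Sum squared differences; give up (return None) once the sum reaches bound."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= bound:
            return None
    return total


def knn_search(index, query, k):
    """Return the k nearest neighbors (k >= 1) as sorted (distance, row) pairs."""
    mean, basis, coeffs, order = index
    q = (query - mean) @ basis
    n = len(coeffs)
    # Sorted-array lookup stands in for the B+-tree that locates the start page.
    hi = int(np.searchsorted(coeffs[:, 0], q[0]))
    lo = hi - 1
    best = []  # max-heap of (-squared_distance, original_row)

    def bound():
        return -best[0][0] if len(best) == k else np.inf

    def visit(i):
        d2 = partial_distance(q, coeffs[i], bound())
        if d2 is None:
            return
        item = (-d2, int(order[i]))
        if len(best) == k:
            heapq.heapreplace(best, item)
        else:
            heapq.heappush(best, item)

    while lo >= 0 or hi < n:
        # The principal-component gap is a lower bound on the true distance.
        gap_lo = (q[0] - coeffs[lo, 0]) ** 2 if lo >= 0 else np.inf
        gap_hi = (coeffs[hi, 0] - q[0]) ** 2 if hi < n else np.inf
        if min(gap_lo, gap_hi) >= bound():
            break  # no unvisited vector in either direction can improve the result
        if gap_lo <= gap_hi:
            visit(lo)
            lo -= 1
        else:
            visit(hi)
            hi += 1
    return sorted((float(np.sqrt(-neg)), row) for neg, row in best)
```

Given an `(n, d)` array `data`, `knn_search(build_index(data), query, k)` should return the same neighbors as a brute-force scan, because the orthonormal KL basis preserves Euclidean distances and the two pruning tests only skip vectors that cannot improve the current result.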

Authors: Cui Jiangtao (崔江涛), Fu Shaofeng (付少锋), Zhan Haisheng (詹海生), Zhou Lihua (周利华)

Affiliation: School of Computer Science, Xidian University (西安电子科技大学计算机学院)

Source: Systems Engineering and Electronics (《系统工程与电子技术》), 2006, No. 12, pp. 1927-1931 (5 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: Supported by the Tenth Five-Year Plan National Defense Science and Technology (Electronics) Pre-Research Fund (413160501)

Keywords: high-dimensional indexing; vector approximation; nearest neighbor search; KL transform; principal component sorting

Classification number: TP311.134.3 [Automation and Computer Technology: Computer Software and Theory]
