基于XML索引技术的有效外延连接

Efficient Extension Join Algorithm for Querying XML Data Based on Index Techniques

下载PDF

导出

摘要首先给出了XML文档树、元素外延和名字路径等的形式化定义.接着,将编码方案、路径索引和名字外延的思想相结合,提出了一种改进的XML数据的索引结构(类型索引集、名字索引集和外延索引),解决了基于传统索引技术的XML数据查询方法性能上的不足.它既可以有效地支持结构连接的计算以快速地判断任意结点之间的子孙后代关系,也可以有效地支持基于名字外延的路径连接算法以快速地判断任意结点之间的父子关系,然后还可以快速地支持对包含拥有关系的小枝查询;进而给出了基于该索引结构的外延连接算法,并着重对其处理含有父子关系和拥有关系等较复杂的XPath查询路径的不同处理过程进行了对比和分析,使得对于一条长度为n的XPath绝对路径查询,最多只需要n/2-1次外延连接,且能够根据双亲结构信息等利用外延索引尽可能跳过不需要参与连接的结点.实验结果表明,提出的新的索引结构可以有效地提高查询处理的性能. Firstly, formal definitions of XML tree data model, element extension, name path, etc are given in context environment. Secondly, an improved index structure, including type index set, name index set and extension index, is proposed to retrieve XML data based on the idea of the numbering scheme, the path index and name extension. The index structure solves the problem of the poor performance of XML query based on conventional index techniques. It can not only quickly determine ancestor/descendant relationships by supporting the structural join algorithm, but also quickly determine parent/child relationships by supporting the path join algorithm based on name extension, and meanwhile effectively process twig query including holding relationships. Finally, an extension join algorithm based on this index structure is proposed to process XPath path query efficiently, in which comparisons and analyses among the different processing approaches for complicated XPath path expressions with parent/child and holding relationships are conducted. For an XPath absolute path query with n nodes, the extension join is needed to execute for n/2-1 times at most by avoiding scanning each node which does not participate in the join via extension index based on structure information, etc. Experimental results show that the new index structure can effectively enhance the query performance for XML data.

作者姜学锋衡星辰覃征邵利平

机构地区西北工业大学计算机学院西安交通大学电子与信息工程学院

出处《计算机研究与发展》 EI CSCD 北大核心 2008年第6期1043-1055,共13页 Journal of Computer Research and Development

基金国家“九七三”重点基础研究发展规划基金项目(2004CB719401)~~

关键词 XML文档 XML索引结构多模态外延连接 XPATH XML document XML index structure multi-modal extension join XPath

分类号 TP311 [自动化与计算机技术—计算机软件与理论]

引文网络
相关文献

参考文献17

1S-Y Chien, Z Vagena, D Zhang, et al. Efficient structural joins on indexed XML documents [C]. VLDB Z00Z, Hong Kong, 2002.
2H Jiang, H Lu, W Wang, et al. XR Tree:Indexing XML data for efficient structural joins [C]. In: Proc of the 19th Int' l Conf on Data Engineering (ICDE' 03). Los Alamitos, CA: IEEE Computer Society Press, 2003. 253-264.
3N Bruno, N Koudas, D Srivastava. Holistic twig joins: Optional XML pattern matching [C]. ACM SIGMOD 2002, Madison, Wisconsin, 2002.
4Q Li, B Moon. Indexing and querying XML data for regular path expressions [C]. VLDB 2001, Roma, 2001.
5贾福林,王国仁,于戈.基于DOM的XML数据库的索引技术研究[J].计算机研究与发展,2004,41(1):175-186. 被引量：18
6万常选,刘云生,徐升华,刘喜平,林大海.基于区间编码的XML索引结构的有效结构连接[J].计算机学报,2005,28(1):113-127. 被引量：38
7Qinghua Zou, Shaorong Liu. Ctree:A compact tree for indexing XML data [C]. WIDM' 04, Washington DC, 2004.
8包小源,宋再生,唐世渭,杨冬青,王腾蛟.SuffIndex——一种基于后缀树的XML索引结构[J].计算机研究与发展,2004,41(10):1793-1801. 被引量：7
9C Zhang, J Naughton, D DeWitt, et al. On supporting containment queries in relational database management systems [C]. ACM SIGMOD 2001, Santa Barbara, California, 2001.
10Jianhua Lv, Guoren Wang, Jeffrey X Yu, etal. Performance evaluation of a DOM-based XML database: Storage, indexing and query optimization [C]. The 3rd Web Age Information Management Conf (WAIM2002), Beijing, 2002.

二级参考文献32

1Wan,Chang-xuan,Liu,Yun-Sheng.X-RESTORE: Middleware for XML's Relational Storage and Retrieve[J].Wuhan University Journal of Natural Sciences,2003,8(01A):28-34. 被引量：4
2万常选,刘云生,徐升华,林大海.基于X-RESTORE查询XML视图[J].小型微型计算机系统,2004,25(10):1870-1875. 被引量：2
3[1]T Bray, J Paoli, C Sperberg-McQueen. Extensible Markup Language(XML) 1.0. 1998. http:∥www.w3.org/TR/1998/REC-xml-19980210
4[2]Jonathan Robie, Arnaud Le Hors. Document Object Model level 2. 2000. http:∥www.w3.org/TR/2000/REC-DOM2
5[3]Don Chamberlin, James Clark. XQuery 1.0: An XML Query Language. W3C Working Draft 07, 2001. http:∥www.w3.org/TR/2001 /WD-xquery-20010607
6[4]J Cark, S DeRose. XMP path language(XPath), ver 1.0. World Wide Web Consortium, Tech Rep: REC-xpath-19991116, 1999
7[5]Don Chamberlin, Jonathan Robie, Daniela Florescu. Quilt: An XML query language for heterogeneous data sources. The Int'l Workshop on the Web and Databases(WebDB'2000), Dallas, TX, 2000
8[6]Alin Deutsch, Mary Fernandez, Daniela Florescu .et al.. A query language for XML. The 8th Int'l World Wide Web Conf, Toronto, 1999
9[7]J Robie, J Lapp, D Schach. XML Query Language(XQL), 1998. http:∥www.w3.org/TandS/QL/QL98/cfp.S
10[8]Abiteboul, D Quass, J McHugh .et al.. The Lorel query language for semistructured data. Int'l Journal on Digital Libraries, 1997, 1(1): 68～88

共引文献56

1周理,刘琰.XRes:基于关键字的XML查询引擎[J].科技资讯,2007,5(35):88-89.
2娄芳,白似雪,于海雯.基于路径定位的XML数据库存储策略[J].南昌大学学报（理科版）,2005,29(3):298-300. 被引量：1
3王建仁,徐文静.一种新的基于XML的关系数据映射索引技术[J].微型机与应用,2005,24(9):12-14.
4曾瑞.基于Visual Basic环境下的XML信息挖掘与开发方法研究[J].科技广场,2005(10):56-59.
5徐文静,崔俊凯.基于XML架构的关系数据映射索引机制的研究[J].情报杂志,2005,24(12):65-67. 被引量：1
6焦建军,阮秋琦.一种实用的XML模式与关系型模式的数据交换模型及其实现[J].现代计算机,2006,12(1):78-81. 被引量：1
7易平,胡运安,陈福生,张世永.基于PATRICIA-TRIES的XML路径索引设计[J].小型微型计算机系统,2006,27(3):474-480. 被引量：2
8王卓,刘大昕,冯晓宁.基于框架的软构件后缀索引构造方法研究[J].哈尔滨工程大学学报,2006,27(1):107-111. 被引量：1
9杨悦欣,陈桦,孙波.基于ASP与XML的设施农业数据录入系统[J].陕西科技大学学报（自然科学版）,2006,24(1):65-69.
10严海兵,吴银芳,崔志明.基于结构聚类的XML数据存储研究[J].苏州科技学院学报（自然科学版）,2006,23(3):68-72. 被引量：1

1李环,衡星辰,覃征.一种具有多模态索引结构的XML查询算法[J].小型微型计算机系统,2008,29(5):807-812. 被引量：1
2万常选,刘云生,徐升华,刘喜平,林大海.基于区间编码的XML索引结构的有效结构连接[J].计算机学报,2005,28(1):113-127. 被引量：38
3长期备份策略存储数据100年？[J].新电脑,2009(1):144-147.
4包小源,宋再生,唐世渭,杨冬青,王腾蛟.SuffIndex——一种基于后缀树的XML索引结构[J].计算机研究与发展,2004,41(10):1793-1801. 被引量：7
5李新叶,李新芳.基于XML的供电企业分布式异构信息的检索[J].电力系统通信,2008,29(3):56-59. 被引量：1
6许娴.XML数据库的查询技术研究[J].江苏技术师范学院学报,2013,19(4):7-12. 被引量：1
7曾一,许娴,张元平.一种基于Schema的XML索引结构[J].计算机工程,2006,32(18):64-66. 被引量：8
8杨梅.一种基于OB树的XML索引结构[J].科学技术与工程,2008,8(1):242-245.
9杨梅,闵应骅.一种基于模式的XML索引结构[J].湘潭师范学院学报（自然科学版）,2007,29(3):22-24. 被引量：1
10任建华,周建,孟祥福,魏珂.基于关键字之间结构关系的XML查询结果排序方法[J].计算机科学,2013,40(6):178-182. 被引量：2

计算机研究与发展

2008年第6期

浏览历史

内容加载中请稍等...

基于XML索引技术的有效外延连接

参考文献17

二级参考文献32

共引文献56

相关作者

相关机构

相关主题

浏览历史