A Data Placement Strategy for Parallel XML Databases (cited by: 7)
Abstract: This paper studies data partitioning strategies for XML documents so that XML queries can be processed in parallel. To describe XML data partitioning, it introduces the concept of an intermediary node. A set of intermediary nodes splits an XML data tree into a root tree and a set of subtrees: the root tree is replicated at every site, while the subtrees are distributed evenly across the sites according to the workload of user queries. The same XML data tree admits many sets of intermediary nodes, and different sets yield different partitionings. The quality of a partitioning is measured by how well the user query workload is balanced across the partitions. Choosing an optimal set of intermediary nodes is an NP-hard problem, so the paper designs a set of heuristic optimization rules. Building on this idea, it proposes and implements WIN (workload-aware intermediary nodes data placement strategy), an XML data partitioning algorithm based on intermediary nodes. Extensive experiments show that WIN outperforms earlier parallel XML data partitioning strategies in both speedup and scaleup.
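The partitioning idea in the abstract can be illustrated with a small sketch. This is not the WIN algorithm, whose heuristic rules the abstract does not spell out. It assumes the set of intermediary nodes is already given, that every child of an intermediary node roots one subtree, and that each node carries a hypothetical `workload` number. The subtrees are then assigned with a plain greedy rule (heaviest subtree first, to the currently least-loaded site), which is a stand-in chosen for this example. The names `Node`, `split`, and `place` are invented here.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    name: str
    workload: float = 0.0  # hypothetical query cost attributed to this node
    children: List["Node"] = field(default_factory=list)


def subtree_workload(node: Node) -> float:
    """Total workload of the subtree rooted at `node`."""
    return node.workload + sum(subtree_workload(c) for c in node.children)


def split(root: Node, intermediary: Set[str]) -> Tuple[List[str], List[Node]]:
    """Cut the tree at the intermediary nodes.

    Assumption of this sketch: the root tree keeps everything down to and
    including the intermediary nodes; each of their children roots a subtree.
    """
    root_tree: List[str] = []
    subtrees: List[Node] = []
    stack = [root]
    while stack:
        node = stack.pop()
        root_tree.append(node.name)
        if node.name in intermediary:
            subtrees.extend(node.children)  # cut here, do not descend
        else:
            stack.extend(node.children)
    return root_tree, subtrees


def place(subtrees: List[Node], num_sites: int) -> Tuple[Dict[int, List[str]], Dict[int, float]]:
    """Greedy balancing: heaviest subtree goes to the least-loaded site."""
    heap = [(0.0, site) for site in range(num_sites)]
    heapq.heapify(heap)
    placement: Dict[int, List[str]] = {site: [] for site in range(num_sites)}
    for st in sorted(subtrees, key=subtree_workload, reverse=True):
        load, site = heapq.heappop(heap)
        placement[site].append(st.name)
        heapq.heappush(heap, (load + subtree_workload(st), site))
    loads = {site: load for load, site in heap}
    return placement, loads


# Toy document: two intermediary nodes, four subtrees, two sites.
root = Node("site", 1, [
    Node("people", 1, [Node("p1", 5), Node("p2", 3)]),
    Node("items", 1, [Node("i1", 4), Node("i2", 4)]),
])
root_tree, subtrees = split(root, {"people", "items"})
placement, loads = place(subtrees, num_sites=2)
print("replicated root tree:", sorted(root_tree))
print("placement:", placement)
print("per-site workload:", loads)
```

Choosing a different intermediary set (for example `{"site"}`) changes both the replicated root tree and the subtrees available for balancing, which is why the choice of intermediary nodes drives the quality of the partitioning.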
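Source: Journal of Software (《软件学报》), 2006, Issue 4, pp. 770-781 (12 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China; Doctoral Program Foundation of the Ministry of Education of China.
Keywords: parallel database; XML document; workload; data partitioning; intermediary node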