Abstract
To address the problem of querying massive XML data, a multi-predicate selection query processing method under the MapReduce programming model is proposed. The method queries massive XML data in parallel, and the parallel query results satisfy the multi-predicate query requirements given by the user. A storage method for massive XML data is proposed, in which the data is partitioned into many XML data blocks that are stored in HDFS. A Map logic algorithm and a Reduce logic algorithm based on multi-predicate selection under the MapReduce programming model are proposed to realize parallel query processing of massive XML data. Furthermore, a MapReduce query optimization method based on multi-predicate selection is proposed, which reduces the amount of data transmitted and improves system performance. Finally, experimental results demonstrate the efficiency and effectiveness of the proposed approach.
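The abstract does not give the Map and Reduce algorithms themselves, so the following is only a minimal single-process sketch of the general idea, not the authors' method. It assumes each XML data block is a small well-formed document, that a query is a conjunction of predicates over child elements, and that filtering inside the map step is one common way to cut the data passed on to the reduce step. The sample data, `PREDICATES`, `map_block`, `reduce_results`, and `run_query` are all hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: each string stands in for one XML data block
# that would be stored on HDFS in a real deployment.
BLOCKS = [
    "<books><book id='1'><price>30</price><year>2012</year></book>"
    "<book id='2'><price>80</price><year>2014</year></book></books>",
    "<books><book id='3'><price>25</price><year>2015</year></book></books>",
]

# Hypothetical multi-predicate query: an element must satisfy every predicate.
PREDICATES = [
    lambda e: float(e.findtext("price")) < 50,
    lambda e: int(e.findtext("year")) >= 2012,
]


def map_block(block):
    """Map step: parse one block and emit (id, xml) pairs only for
    elements that pass all predicates, so less data leaves the mapper."""
    root = ET.fromstring(block)
    for elem in root.iter("book"):
        if all(pred(elem) for pred in PREDICATES):
            yield elem.get("id"), ET.tostring(elem, encoding="unicode")


def reduce_results(pairs):
    """Reduce step: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def run_query(blocks):
    """Simulate the map and reduce phases sequentially in one process."""
    pairs = [kv for block in blocks for kv in map_block(block)]
    return reduce_results(pairs)


if __name__ == "__main__":
    for book_id, matches in sorted(run_query(BLOCKS).items()):
        print(book_id, matches)
```

In an actual Hadoop job the blocks would be read as input splits and the map calls would run in parallel on separate nodes; the sequential loop here only mimics that data flow.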
Source
《小型微型计算机系统》
CSCD
PKU Core Journal (北大核心)
2015, No. 7, pp. 1415-1420 (6 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (61370075, 60873010)
Supported by the Liaoning University Youth Research Fund (2012LDQN19)