Abstract
With the explosive growth of Semantic Web data, efficiently managing massive RDF data has become a key problem. Existing centralized, relational RDF storage and management systems struggle to meet this demand, so a growing number of researchers are turning to distributed systems and parallel computing techniques to manage massive RDF data. This paper proposes an RDF data storage model based on the distributed database HBase. Data are partitioned by class according to the OWL ontology definition file, and the triples belonging to a class are stored in that class's two tables, S_PO and O_PS. Query algorithms for the eight Triple Pattern forms and for Basic Graph Patterns are implemented on this storage model, with support for partial inference. The feasibility of the storage model and the query algorithms is verified on a Hadoop cluster.
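The abstract does not give the table layout, so the following is only an illustrative sketch, not the authors' implementation. It assumes that each class owns an S_PO table keyed by subject and an O_PS table keyed by object, and uses in-memory dictionaries in place of HBase tables. The names `ClassPartitionedStore`, `add`, and `match` are hypothetical, and the caller supplies the class, since the abstract does not say how a triple is assigned to one. Each of subject, predicate, and object is either bound or unbound, which yields the eight triple pattern forms.

```python
from collections import defaultdict


class ClassPartitionedStore:
    """In-memory stand-in for per-class S_PO and O_PS tables (assumed layout)."""

    def __init__(self):
        # class name -> {"S_PO": {subject: {(p, o)}}, "O_PS": {object: {(p, s)}}}
        self.tables = defaultdict(
            lambda: {"S_PO": defaultdict(set), "O_PS": defaultdict(set)}
        )

    def add(self, cls, s, p, o):
        """Store one triple in both tables of the given class."""
        t = self.tables[cls]
        t["S_PO"][s].add((p, o))
        t["O_PS"][o].add((p, s))

    def match(self, cls, s=None, p=None, o=None):
        """Answer a triple pattern within one class; None means unbound."""
        t = self.tables[cls]
        if s is not None:
            # Subject bound: one row lookup in S_PO.
            rows = ((s, p2, o2) for p2, o2 in t["S_PO"].get(s, ()))
        elif o is not None:
            # Only the object side is bound: one row lookup in O_PS.
            rows = ((s2, p2, o) for p2, s2 in t["O_PS"].get(o, ()))
        else:
            # Neither row key is bound: scan the class's S_PO table.
            rows = (
                (s2, p2, o2)
                for s2, pos in t["S_PO"].items()
                for p2, o2 in pos
            )
        # Apply the remaining bound positions as filters.
        return sorted(
            r for r in rows
            if (p is None or r[1] == p) and (o is None or r[2] == o)
        )


store = ClassPartitionedStore()
store.add("Student", "ex:alice", "ex:takes", "ex:db101")
store.add("Student", "ex:bob", "ex:takes", "ex:db101")
print(store.match("Student", o="ex:db101"))
```

In this sketch a pattern with a bound subject or object needs a single row lookup, while a pattern that binds neither falls back to scanning one class's table rather than the whole data set, which is the presumed benefit of partitioning by class.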
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journals (北大核心)
2013, Issue S1, pp. 23-31 (9 pages)
Journal of Computer Research and Development
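Funding
National Natural Science Foundation of China (61100040)
National Social Science Foundation of China (11AZD121)
National High Technology Research and Development Program of China ("863" Program) (2011AA01A202)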