Abstract
To improve the storage and query efficiency of massive RDF (Resource Description Framework) data, the RDF storage methods of three open-source databases, Virtuoso, TDB, and Neo4j, were studied. The three represent different storage schemes: relational-database-based, direct-index-based, and graph-based. Using 2.7 billion Freebase triples as experimental data, the storage and query efficiency of the three databases was compared. Taking into account storage and query efficiency, SPARQL support, scalability, and other factors, Virtuoso was found to be the best of the three for storing and querying massive RDF data. A method for storing and querying RDF data in the graph database Neo4j was also proposed in the course of the study.
Source
Journal of Beijing Information Science and Technology University (Natural Science Edition) (《北京信息科技大学学报(自然科学版)》)
2017, No. 3, pp. 63-69 (7 pages)
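Funding
National 863 Program project "Intelligent Assessment of Knowledge and Ability and Human-like Question-Answering Verification System for Basic Education" (2015AA015409)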