
RDF Grouping Compression Based on Delta Encoding
Cited by: 1
Abstract: The development of Semantic Web technology has caused the volume of Resource Description Framework (RDF) data to grow rapidly, raising its demands on storage space and transmission bandwidth. Existing general-purpose and RDF-specific compression methods address this problem but still leave data redundancy. To this end, this paper proposes an RDF grouping compression algorithm based on delta encoding. The algorithm groups RDF data by the combination of predicates connected to each object, which eliminates object redundancy and further reduces predicate redundancy. On this basis, it applies delta encoding to the subject sequences obtained after grouping to further reduce their storage space. Experimental results show that, compared with the Plain, HDT, and HDT++ algorithms, the proposed algorithm improves performance by 17% on average on the less structured datasets Archives Hub, Linkedmdb, rdfabout, and DBpedia, and by 23% on the highly structured dataset dbtune, indicating good RDF compression performance on datasets with different degrees of structure.
Authors: WU Weixin (伍伟鑫), HAN Jingyu (韩京宇), ZHU Man (朱曼) (School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source: Computer Engineering (《计算机工程》; indexed in CAS, CSCD, and the Peking University Core list), 2020, No. 11, pp. 117-123 (7 pages)
Funding: National Natural Science Foundation of China (61602260); Key Project of the Jiangsu Provincial Social Science Fund (18GLA004).
Keywords: semantic Web; Resource Description Framework (RDF); degree of structure; data compression; delta encoding
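
The two steps named in the abstract, grouping by predicate combination and delta encoding of subject sequences, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the integer-ID triple format, the nested-dictionary layout, and the function names `delta_encode`, `delta_decode`, and `group_and_encode` are all assumptions made for illustration, and the paper's actual on-disk structures are not described in this record.

```python
from collections import defaultdict


def delta_encode(sorted_ids):
    """Keep the first ID as-is, then store the gap to each following ID."""
    gaps = []
    prev = 0
    for value in sorted_ids:
        gaps.append(value - prev)
        prev = value
    return gaps


def delta_decode(gaps):
    """Rebuild the original sorted IDs by summing the gaps."""
    ids = []
    total = 0
    for gap in gaps:
        total += gap
        ids.append(total)
    return ids


def group_and_encode(triples):
    """Group objects by the set of predicates that reach them (assumed layout).

    triples: iterable of (subject_id, predicate_id, object_id) integers.
    Returns {predicate_combination: {object: {predicate: subject_gaps}}}.
    """
    subjects = defaultdict(set)    # (object, predicate) -> subject IDs
    predicates = defaultdict(set)  # object -> predicates connected to it
    for s, p, o in triples:
        subjects[(o, p)].add(s)
        predicates[o].add(p)

    groups = defaultdict(dict)
    for o, preds in predicates.items():
        combo = tuple(sorted(preds))  # stored once per group, not per object
        groups[combo][o] = {
            p: delta_encode(sorted(subjects[(o, p)])) for p in combo
        }
    return dict(groups)


# Hypothetical toy data: objects 7 and 8 share the predicate combination (1, 2).
triples = [(100, 1, 7), (103, 1, 7), (104, 2, 7),
           (101, 1, 8), (110, 2, 8), (5, 1, 9)]
encoded = group_and_encode(triples)
```

In this toy layout the predicate combination is written once per group, and sorted subject IDs that lie close together turn into small gaps, which a variable-length integer code could then store compactly.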
