Abstract
Existing distributed parallel compression and encoding algorithms for RDF data do not take the ontology file into account, so the encoded data carries no semantic information, which hampers distributed querying and reasoning. To address this, the SCOM (Semantic Coding with Ontology on MapReduce) algorithm is proposed to perform semantic parallel encoding of RDF data on distributed MapReduce. The algorithm first uses the ontology of the RDF data to build class-relationship and property-relationship models. After classifying and filtering the triple terms, it encodes them and generates a dictionary table, yielding an encoding of the RDF data that carries semantic information and follows regular patterns. In addition, SCOM can easily restore the encoded RDF data file to the original file. Experimental results show that SCOM efficiently performs distributed parallel encoding of large-scale data.
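The pipeline the abstract outlines (ontology model, term encoding, dictionary table, reversible decoding) can be illustrated with a minimal single-machine sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `number_classes`, `encode`, and `decode` are hypothetical, the pre-order numbering of the class hierarchy is only one possible way to make codes "regular", and the MapReduce distribution and the property-relationship model are omitted entirely.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Code = Tuple[int, int, int]


def number_classes(subclass_of: Dict[str, str]) -> Dict[str, int]:
    """Assign class IDs by pre-order traversal of the class hierarchy.

    Assumption for illustration: each class has at most one parent, so the
    hierarchy is a forest and every subtree receives a contiguous ID range.
    """
    children: Dict[str, List[str]] = {}
    for child, parent in subclass_of.items():
        children.setdefault(parent, []).append(child)
    nodes = set(subclass_of) | set(subclass_of.values())
    roots = sorted(n for n in nodes if n not in subclass_of)

    ids: Dict[str, int] = {}
    stack = list(reversed(roots))
    while stack:
        node = stack.pop()
        ids[node] = len(ids) + 1
        # Push children in reverse order so the smallest name is visited first.
        stack.extend(sorted(children.get(node, []), reverse=True))
    return ids


def encode(triples: List[Triple], class_ids: Dict[str, int]):
    """Encode triples as integer codes and return them with the dictionary table."""
    dictionary = dict(class_ids)  # ontology classes keep their hierarchy-aware codes
    next_id = max(dictionary.values(), default=0) + 1
    encoded: List[Code] = []
    for triple in triples:
        codes = []
        for term in triple:
            if term not in dictionary:  # first occurrence: allocate a fresh code
                dictionary[term] = next_id
                next_id += 1
            codes.append(dictionary[term])
        encoded.append((codes[0], codes[1], codes[2]))
    return encoded, dictionary


def decode(encoded: List[Code], dictionary: Dict[str, int]) -> List[Triple]:
    """Restore the original triples by inverting the dictionary table."""
    reverse = {code: term for term, code in dictionary.items()}
    return [(reverse[s], reverse[p], reverse[o]) for s, p, o in encoded]
```

With this numbering, testing whether one class is a descendant of another reduces to an integer range comparison on the codes, which is the kind of semantic information a plain hash-based or first-seen dictionary does not preserve.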
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2016, Issue 9, pp. 197-202, 212 (7 pages)
Computer Science
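Funding
National Youth Fund Project (61300104)
Fujian Province Science and Technology Military Support Fund Project (JG2014001)
Natural Science Foundation of Fujian Province (2012J01168)
Fuzhou University Science and Technology Development Fund (2013-XQ-32)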