期刊文献+

双目标优化的RDF图分割算法 被引量:2

RDF graph partitioning algorithm based on double objective optimization
下载PDF
导出
摘要 分布式存储是解决大规模数据存储的一种比较有效的方法,而数据分割是实现分布式存储的前提。面对不断增长的RDF数据,提出一种基于双目标优化的RDF图分割算法(RDF Graph Partitioning algorithm based on Double Objective Optimization,RGPDOO)。RGPDOO将边割和分割平衡两项图分割指标融合到一个目标函数,并依据此目标函数,实现了RDF图的静态和动态分割。其中静态图分割通过对图进行初始划分,将图中顶点分成内核顶点、交叉顶点和自由顶点三类。然后通过计算目标函数增益对交叉和自由顶点进行分配。动态图分割部分,针对RDF元组的插入和删除给出相应的解决方案。同时,为了满足图分割目标,算法每隔一段时间T会根据子图的平衡性和紧密性进行一次动态调整。实验选择合成和真实数据集进行测试,并分别与几种通用的静态和动态图分割算法进行比较。实验结果表明提出的算法能够有效地实现RDF图的静态和动态分割。 Distributed storage is a more effective method for the mass data storage. And, the data partitioning is the premise of distributed storage. In facing of the growing semantic web data, RDF Graph Partitioning algorithm is proposed by Double Objective Optimization(RGPDOO). RGPDOO fuses edge cut and load balancing together to get an objective function. According to this objective function, RGPDOO achieves static and dynamic partitioning of RDF graph. For the static partitioning, an initial partitionis executed to divide the node into three kinds:kernel nodes, boundary nodes and freedom nodes. And then, the boundary and freedom nodes are distributed to apartition with the max gain of objective function. For the dynamic partitioning, the insertion and deletion solution of triples are given by the objective function.And, RGPDOO will execute a dynamic adjustment at a certain time interval according to the balance and tightness of partitioning subgraph to satisfy the partitioning object. Finally, the algorithm is tested on synthetic and real datasets in comparison with several general graph partitioning algorithms. The experimental results show that RGPDOO is more suitable for RDF graph partitioning.
出处 《计算机工程与应用》 CSCD 北大核心 2017年第21期24-31,53,共9页 Computer Engineering and Applications
基金 国家自然科学基金(No.U1301253 No.61672123) 广东省科技计划(No.2015B010110006) 国家重点研发计划(No.2016YFD0800300) 辽宁省博士科研启动基金项目(No.201601348 No.201601349)
关键词 RDF图 静态分割 动态分割 边割 负载均衡 RDF graph static partitioning dynamic partitioning edge cut load balancing
  • 相关文献

参考文献2

二级参考文献23

  • 1Borthakur D. HDFS architecture guide. HADOOP APACHEPROJECT. http://hadoop. apache, org/common/docs/current/hdfs design, pdf.
  • 2Chang F, Dean J, Ghemawat S,et al. Bigtable: A distributedstorage system for structured data. ACM Trans, on ComputerSystems (TOCS), 2008,26(2): 4.
  • 3Webber J. A programmatic introduction to neo4j. Proc. of the3rd Annual Conference on Systems, Programming, andApplications: Software for Humanity. ACM.2012. 217-218.
  • 4Xin RS, Gonzalez JE, Franklin MJ,et al. Graphx: A resilientdistributed graph system on spark. First InternationalWorkshop on Graph Data Management Experiences andSystems. ACM. 2013. 2.
  • 5Low Y,Bickson D, Gonzalez J, et al. Distributed GraphLab: Aframework for machine learning and data mining in the cloud.Proc. of the VLDB Endowment, 2012, 5(8): 716-727.
  • 6Malewicz Q Austem MH, Bik AJC, et al. Pregel: A systemfor large-scale graph processing. Proc. of the 2010 ACMSIGMOD International Conference on Management of Data.ACM. 2010. 135-146.
  • 7Shao B, Wang H, Li Y. Trinity: A distributed graph engine ona memory cloud. Proc. of the 2013 ACM SIGMODInternational Conference on Management of Data. ACM.2013.505-516.
  • 8Garey MR, Johnson DS, Stockmeyer L. Some simplifiedNP-complete graph problems. Theoretical Computer Science,1976,1(3): 237-267.
  • 9Kemighan BW, Lin S. An efficient heuristic procedure forpartitioning graphs. Bell System Technical Loumal, 1970,49(2): 291-307.
  • 10Fiduccia CM, Mattheyses RM. A linear-time heuristic forimproving network partitions. 19th Conference on DesignAutomation. IEEE. 1982. 175-181.

共引文献34

同被引文献19

引证文献2

二级引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部