Abstract
Graphs are an important data structure that can describe the relationships and dependencies among real-world entities, and they are therefore widely used in computing. Many problems, such as network routing and network flow, can be solved efficiently with algorithms grounded in graph theory. With the rapid development of Web 2.0, big data, social networks, machine learning, and data mining, the graphs abstracted from many domains have grown exponentially in size, with nodes, edges, and weights now numbering in the billions, which places new demands on graph computing performance. Starting from the BSP (Bulk Synchronous Parallel) framework, the theoretical basis of graph computing frameworks, this paper analyzes the algorithms and performance of current distributed graph processing platforms on massive natural graphs, and proposes GridGraph, a graph computing system that organizes the edges of a graph into a "grid" representation combined with a graph partitioning scheme. Experimental results show that GridGraph outperforms single-machine graph computing systems and is even faster than mainstream distributed graph processing systems that require more resources.
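The abstract does not give implementation details, so the following is only a minimal sketch of one common way to organize edges into a grid: vertices are split into `p` equal-sized chunks, and each edge is placed in the block indexed by the chunks of its source and destination. The function name `partition_edges`, the parameter `p`, and the sample edge list are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def partition_edges(edges, num_vertices, p):
    """Assign each (src, dst) edge to block (i, j) of a p x p grid.

    Assumption: vertex ids are integers in [0, num_vertices), and
    vertices are split into p contiguous, equal-sized chunks.
    """
    chunk = (num_vertices + p - 1) // p  # vertices per chunk (ceiling division)
    grid = defaultdict(list)
    for src, dst in edges:
        # Row index comes from the source chunk, column from the destination chunk.
        grid[(src // chunk, dst // chunk)].append((src, dst))
    return grid


# Hypothetical 4-vertex graph split into a 2 x 2 grid.
edges = [(0, 1), (0, 3), (1, 2), (2, 0), (3, 2)]
grid = partition_edges(edges, num_vertices=4, p=2)
for block in sorted(grid):
    print(block, grid[block])
```

Grouping edges this way means that processing one block only touches the vertices of one source chunk and one destination chunk, which is the locality property that grid-style layouts are generally intended to provide.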
Author
HUANG Cheng-ning (黄承宁) (Nanjing Tech University Pujiang Institute, Nanjing 211222, China)
Source
Computer Technology and Development (《计算机技术与发展》)
2019, No. 5, pp. 187-191 (5 pages)
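Funding
2017 Jiangsu Province University Philosophy and Social Science Research Project (2017SJB2096)
2017 National Young Teachers' Education and Teaching Research Project, National Academic Committee for Young Teachers' Education and Teaching Research (2017QNJ041)
2017 University-Level Key Project for Education and Teaching Reform Research (2017JG003Z)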