期刊文献+

图数据格式对三角形计数算法影响的特性分析 被引量:1

Characteristic Analysis of the Influence of Graph Data Format on Triangle Counting Algorithm
下载PDF
导出
摘要 图计算由当前图应用与输入图数据驱动,而图应用各式各样、图结构也千差万别,相同图应用处理不同图时性能差异巨大.为探究图数据格式对图算法的性能影响,本文选取5种常用的图数据格式COO、CSC、CSR、DCSC和CSCI以及社区发现算法三角形计数在图数据p2p-Gnutella04、p2p-Gnutella06、soc-Epinions1上的应用作为分析对象,定义了图数据格式对图计算系统影响的性能指标,包括执行时间、数据移动量、计算量、功耗和各级cache MPKI等,基于Skylake Xeon(R)Platinum 8164处理器进行性能事件采集.实验结果表明,TC在COO、CSC、CSR、DCSC和CSCI格式下运行p2p-Gnutella04/06、soc-Epinions1图数据的执行时间(归一化到最长执行时间)之比为35.7%、0.04%、0.15%、9.7%、100%与34.1%、0.05%、1.81%、9.76%、100%和9.49%、0.92%、0.99%、9.1%、100%,数据移动量(归一化到最大数据移动量)之比为74.9%、3.7%、4.5%、20.32%、100%与100%、0.65%、0.81%、27.37%、13.43%和97.08%、42.94%、42.95%、86.38%、100%,计算量(归一化到最大计算量)之比为39.36%、6.5%、8.62%、10.68%、100%与31.6%、6.97%、8.64%、8.67%、100%和100%、0.9%、0.89%、28.09%、33.07%,功耗(归一化到最大功耗)之比为100%、57.39%、47.73%、33.24%、75.28%与37.03%、84.7%、40.8%、43.4%、100%和100%、34.77%、29.01%、28.39%、86%.实验结果对于为TC应用的输入图数据格式选择提供了依据. The graph calculation is driven by the current graph application and the input graph data.There are various graph applications and graph structures.The performance of the same graph application when processing different graphs varies greatly.In order to explore the impact of graph data format on graph algorithm performance, this paper selects five commonly used graph data formats COO,CSC,CSR,DCSC and CSCI and community discovery algorithm triangle count in the graph data p2 p-Gnutella04,p2 p-Gnutella06,soc-Epinions1 As the analysis object, the above application defines the performance indicators that the graph data format affects the graph computing system, including execution time, data movement, calculation, power consumption, and various levels of cache MPKI,etc.,based on Skylake Xeon(R)Platinum 8164 processing the device performs performance event collection.The experimental results show that the ratio of execution time(normalized to the longest execution time)of TC running p2 p-Gnutella04/06 and soc-Epinions1 graph data in COO,CSC,CSR,DCSC and CSCI formats is 35.7% and 0.04%,0.15%,9.7%,100% and 34.1%,0.05%,1.81%,9.76%,100% and 9.49%,0.92%,0.99%,9.1%,100%,data movement amount(normalized to the maximum data The ratio of movement amount)is 74.9%,3.7%,4.5%,20.32%,100% and 100%,0.65%,0.81%,27.37%,13.43% and 97.08%,42.94%,42.95%,86.38%,100%,The ratio of the calculation amount(normalized to the maximum calculation amount)is 39.36%,6.5%,8.62%,10.68%,100% and 31.6%,6.97%,8.64%,8.67%,100% and 100%,0.9%,0.89%,28.09%,33.07%,the ratio of power consumption(normalized to maximum power consumption)is 100%,57.39%,47.73%,33.24%,75.28% and 37.03%,84.7%,40.8%,43.4%,100% and 100%,34.77%,29.01%,28.39%,86%.The experimental results provide a basis for the selection of the input image data format for TC applications.
作者 张世茹 邓军勇 ZHANG Shi-ru;DENG Jun-yong(School of Electronic Engineering,Xi′an University of Posts and Telecommunications,Xi′an 710121,China)
出处 《小型微型计算机系统》 CSCD 北大核心 2023年第1期103-109,共7页 Journal of Chinese Computer Systems
基金 国家自然科学基金项目(61834005,61602377,61772417,61634004,61802304)资助 陕西省科技统筹创新工程项目(2016KTZDGY02-04-02)资助 陕西省重点研发计划(2017GY-060,2022GY-027)资助。
关键词 图计算 图数据格式 三角形计数 性能指标 特性分析 graph calculation graph data format triangle count performance index characteristic analysis
  • 相关文献

参考文献3

二级参考文献28

  • 1Jeffrey Dean,Sanjay Ghemawat.MapReduce[J]. Communications of the ACM . 2008 (1)
  • 2ChengJ F,Liu Q,Li Z G,et al.VENUS:vertex-centric streamlined graph computation on a single PC. Proceedings of the 31st IEEE International Conference on Data Engineering . 2015
  • 3Yuan P P,Zhang W Y,Xie C F,et al.Fast iterative graph computation:a path centric approach. Proceedings of the International Conference for High Performance Computing,Networking,Storage and Analysis . 2014
  • 4Yan D,Cheng J,Lu Y,et al.Blogel:a block-centric framework for distributed computation on real-world graphs. Proceedings of the VLDB Endowment . 2014
  • 5A.KYROLA,G.BLELLOCH,C.GUESTRIN.GraphChi:Large-scale graph computation on just a PC. Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation . 2012
  • 6Leslie G. Valiant.??A bridging model for parallel computation(J)Communications of the ACM . 1990 (8)
  • 7Zheng D,Mhembere D,Burns R,et al.FlashGraph:processing billion-node graphs on an array of commodity SSDs. Proceedings of the 13th USENIX Conference on File and Storage Technologies . 2015
  • 8https://giraph.apache.org/ .
  • 9http://tech.qq.com/a/20140725/000288.htm .
  • 10Han W S,Lee S,Park K,et al.TurboGraph:a fast parallel graph engine handling billion-scale graphs in a single PC. Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining . 2013

共引文献6

同被引文献11

引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部