摘要
图计算由当前图应用与输入图数据驱动,而图应用各式各样、图结构也千差万别,相同图应用处理不同图时性能差异巨大.为探究图数据格式对图算法的性能影响,本文选取5种常用的图数据格式COO、CSC、CSR、DCSC和CSCI以及社区发现算法三角形计数在图数据p2p-Gnutella04、p2p-Gnutella06、soc-Epinions1上的应用作为分析对象,定义了图数据格式对图计算系统影响的性能指标,包括执行时间、数据移动量、计算量、功耗和各级cache MPKI等,基于Skylake Xeon(R)Platinum 8164处理器进行性能事件采集.实验结果表明,TC在COO、CSC、CSR、DCSC和CSCI格式下运行p2p-Gnutella04/06、soc-Epinions1图数据的执行时间(归一化到最长执行时间)之比为35.7%、0.04%、0.15%、9.7%、100%与34.1%、0.05%、1.81%、9.76%、100%和9.49%、0.92%、0.99%、9.1%、100%,数据移动量(归一化到最大数据移动量)之比为74.9%、3.7%、4.5%、20.32%、100%与100%、0.65%、0.81%、27.37%、13.43%和97.08%、42.94%、42.95%、86.38%、100%,计算量(归一化到最大计算量)之比为39.36%、6.5%、8.62%、10.68%、100%与31.6%、6.97%、8.64%、8.67%、100%和100%、0.9%、0.89%、28.09%、33.07%,功耗(归一化到最大功耗)之比为100%、57.39%、47.73%、33.24%、75.28%与37.03%、84.7%、40.8%、43.4%、100%和100%、34.77%、29.01%、28.39%、86%.实验结果对于为TC应用的输入图数据格式选择提供了依据.
The graph calculation is driven by the current graph application and the input graph data.There are various graph applications and graph structures.The performance of the same graph application when processing different graphs varies greatly.In order to explore the impact of graph data format on graph algorithm performance, this paper selects five commonly used graph data formats COO,CSC,CSR,DCSC and CSCI and community discovery algorithm triangle count in the graph data p2 p-Gnutella04,p2 p-Gnutella06,soc-Epinions1 As the analysis object, the above application defines the performance indicators that the graph data format affects the graph computing system, including execution time, data movement, calculation, power consumption, and various levels of cache MPKI,etc.,based on Skylake Xeon(R)Platinum 8164 processing the device performs performance event collection.The experimental results show that the ratio of execution time(normalized to the longest execution time)of TC running p2 p-Gnutella04/06 and soc-Epinions1 graph data in COO,CSC,CSR,DCSC and CSCI formats is 35.7% and 0.04%,0.15%,9.7%,100% and 34.1%,0.05%,1.81%,9.76%,100% and 9.49%,0.92%,0.99%,9.1%,100%,data movement amount(normalized to the maximum data The ratio of movement amount)is 74.9%,3.7%,4.5%,20.32%,100% and 100%,0.65%,0.81%,27.37%,13.43% and 97.08%,42.94%,42.95%,86.38%,100%,The ratio of the calculation amount(normalized to the maximum calculation amount)is 39.36%,6.5%,8.62%,10.68%,100% and 31.6%,6.97%,8.64%,8.67%,100% and 100%,0.9%,0.89%,28.09%,33.07%,the ratio of power consumption(normalized to maximum power consumption)is 100%,57.39%,47.73%,33.24%,75.28% and 37.03%,84.7%,40.8%,43.4%,100% and 100%,34.77%,29.01%,28.39%,86%.The experimental results provide a basis for the selection of the input image data format for TC applications.
作者
张世茹
邓军勇
ZHANG Shi-ru;DENG Jun-yong(School of Electronic Engineering,Xi′an University of Posts and Telecommunications,Xi′an 710121,China)
出处
《小型微型计算机系统》
CSCD
北大核心
2023年第1期103-109,共7页
Journal of Chinese Computer Systems
基金
国家自然科学基金项目(61834005,61602377,61772417,61634004,61802304)资助
陕西省科技统筹创新工程项目(2016KTZDGY02-04-02)资助
陕西省重点研发计划(2017GY-060,2022GY-027)资助。
关键词
图计算
图数据格式
三角形计数
性能指标
特性分析
graph calculation
graph data format
triangle count
performance index
characteristic analysis