Abstract
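As the embedding dimension increases, the computational scale of the permutation entropy (PE) algorithm multiplies, placing higher demands on the timeliness of the computation. Targeting Sunway TaihuLight, the first heterogeneous many-core supercomputer in the world whose computing performance exceeded 100P, this paper proposes a method for porting and parallelizing the permutation entropy algorithm. Between core groups, the phase space matrix is partitioned with MPI; within each core group, OpenACC is used to parallelize the work inside each partition. Then, in light of the architectural characteristics of the SW26010 many-core processor, the number of communications between the master core and the slave cores is reduced and atomic operations are eliminated, so that the permutation entropy algorithm is successfully ported and accelerated. Finally, the method is tested on dam oscillation data. The test results show that the method makes good use of the acceleration capability of the SW26010 many-core processor: the single-core-group version achieves a speedup of up to 7.18 times over the master-core version. A strong scalability analysis on the large-scale Sunway TaihuLight cluster shows a performance improvement of up to 85.6 times with 128 core groups.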
As the embedding dimension increases, the computational scale of the permutation entropy algorithm grows rapidly, which places higher demands on the timeliness of the calculation. The Sunway TaihuLight supercomputer is an independently designed and developed Chinese supercomputer built on a new many-core processor, the SW26010. To take full advantage of the computing and storage resources of this heterogeneous many-core cluster, the phase space matrix is partitioned among the core groups using MPI, and the work inside each core group is parallelized using OpenACC. With these methods, the permutation entropy algorithm is successfully ported and accelerated. Experimental results on a dam oscillation dataset show that the single-core-group version is up to 7.18 times faster than the MPE (management processing element, i.e. master core) version, and that the speedup reaches up to 85.6 when using 128 core groups.
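The partitioning idea described in the abstract can be illustrated with a minimal serial sketch in Python. It is not the authors' MPI/OpenACC implementation: the function names, the `workers` parameter, and the use of a per-worker `Counter` merged at the end (standing in for a reduction across core groups, so that no shared counter needs atomic updates) are assumptions made for illustration only. An embedding dimension m admits m! possible ordinal patterns, which is why the cost rises quickly with m.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Return the index permutation that sorts the window in ascending order."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def count_patterns(series, m, tau, start, stop):
    """Count ordinal patterns for phase-space rows start..stop-1.

    Each (hypothetical) worker fills its own private histogram, so no
    atomic update of a shared counter is needed.
    """
    counts = Counter()
    for i in range(start, stop):
        # Row i of the phase space matrix: m samples spaced tau apart.
        row = [series[i + j * tau] for j in range(m)]
        counts[ordinal_pattern(row)] += 1
    return counts


def permutation_entropy(series, m=3, tau=1, workers=1):
    """Normalized permutation entropy in [0, 1].

    The rows of the phase space matrix are split into `workers` contiguous
    blocks; the loop below is serial and only mimics the data partitioning.
    """
    n_rows = len(series) - (m - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for the given m and tau")
    chunk = math.ceil(n_rows / workers)
    total = Counter()
    for w in range(workers):
        start = w * chunk
        stop = min(start + chunk, n_rows)
        if start < stop:
            # Merge step: plays the role of a reduction over partial histograms.
            total.update(count_patterns(series, m, tau, start, stop))
    entropy = -sum((c / n_rows) * math.log(c / n_rows) for c in total.values())
    return entropy / math.log(math.factorial(m))
```

Because every block produces an independent histogram, the result does not depend on how many blocks the rows are split into; only the final merge touches shared state.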
Authors
Zhang Hao
Hua Rong
Yu Jianzhi
Liang Jianguo
Feng Lubin
Zhang Hao; Hua Rong; Yu Jianzhi; Liang Jianguo; Feng Lubin (College of Computer Science & Engineering, Shandong University of Science & Technology, Qingdao, Shandong 266590, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journals (北大核心)
2020, No. 7, pp. 2022-2026 (5 pages)
Application Research of Computers
Funding
Sub-project of the National Key Research and Development Program of China (2017YFB0202002)
Natural Science Foundation of Shandong Province (ZR2018BF001).