Drug discovery algorithm based on large-scale frequent subgraph mining on a CPU-MIC heterogeneous parallel architecture (cited by: 2)

A scalable CPU-MIC coordinated drug-finding tool by frequent subgraph mining
Abstract: Frequent subgraph mining is an important problem in many practical application areas. Because the task is computationally intensive and both the graph datasets being mined and the resulting pattern sets are large, existing frequent subgraph mining approaches cannot meet practical time requirements, and processing efficiency is currently the main challenge. This paper proposes cmFSM, an original parallel-accelerated frequent subgraph mining tool. cmFSM applies parallel optimization at three levels: fine-grained OpenMP parallelization on a single node, multi-node multi-process parallelization, and CPU-MIC collaborative parallelization. On a single node, cmFSM is twice as fast as the best CPU-based algorithm, and the multi-node scheme provides scalability. The results show that, even with only a modest amount of parallel computing resources, cmFSM significantly outperforms existing state-of-the-art algorithms, which demonstrates the effectiveness of the proposed tool in bioinformatics. In future work, the authors will continue to improve the scalability of the multi-node solution.
Authors: PENG Shaoliang (彭绍亮); NIU Qi (牛琦); LI Kenli (李肯立); ZOU Quan (邹权) (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China)
Source: Big Data Research (《大数据》), 2019, No. 2, pp. 89-103 (15 pages)
Funding: National Key Research and Development Program of China (Nos. 2017YFB0202602, 2018YFC0910405, 2017YFC1311003, 2016YFC1302500, 2016YFB0200400, 2017YFB0202104); National Natural Science Foundation of China (Nos. 61772543, U1435222, 61625202, 61272056)
Keywords: frequent subgraph mining; bioinformatics; parallel algorithm; memory constraints; isomorphism; many integrated core