Highly Scalable Sparse Matrix Multiplication (original title: 高度可伸缩的稀疏矩阵乘法)

Cited by: 3

Abstract: Matrix multiplication is a fundamental operation in linear algebra and graph algorithms, and the matrices that arise in large-scale data processing are usually sparse. The MapReduce programming framework effectively supports distributed computation over massive data sets, so this paper studies how to implement multiplication of very large sparse matrices on top of it. Traditional parallel matrix multiplication algorithms, such as block-based ones, are not optimized for sparse matrices and incur a large amount of unnecessary communication overhead. The paper proposes a new algorithm, CRM (column row multiplication), and compares it with the traditional block-based matrix algorithm. The experimental results show that CRM runs considerably more efficiently, is highly scalable, and is well suited to the MapReduce platform.
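The abstract names the column-row idea but does not spell out CRM itself, so the following is only a minimal single-process sketch of the general column-times-row (outer-product) formulation, not the authors' implementation. It assumes matrices are given as `(row, col, value)` triples; the function name `column_row_multiply` and the grouping scheme are hypothetical and stand in for the map, shuffle, and reduce phases a MapReduce job would perform.

```python
from collections import defaultdict


def column_row_multiply(a_entries, b_entries):
    """Multiply two sparse matrices given as (row, col, value) triples.

    Illustrative assumption: column k of A is paired with row k of B,
    and each pair contributes an outer product to the result. This is
    a generic column-row scheme, not the paper's CRM algorithm.
    """
    # Map/shuffle stand-in: key A's nonzeros by column index k
    # and B's nonzeros by row index k.
    a_cols = defaultdict(list)
    for i, k, value in a_entries:
        a_cols[k].append((i, value))

    b_rows = defaultdict(list)
    for k, j, value in b_entries:
        b_rows[k].append((j, value))

    # Reduce stand-in: only indices k present on both sides produce work,
    # so all-zero columns of A and all-zero rows of B are never touched.
    result = defaultdict(float)
    for k in a_cols.keys() & b_rows.keys():
        for i, a_value in a_cols[k]:
            for j, b_value in b_rows[k]:
                result[(i, j)] += a_value * b_value

    # Drop cells whose partial products cancelled to zero.
    return {cell: value for cell, value in result.items() if value != 0}


# A is 2x3 and B is 3x2, both stored as coordinate triples.
a = [(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0)]
b = [(0, 1, 4.0), (1, 1, 6.0), (2, 0, 5.0)]
product = column_row_multiply(a, b)
```

In this formulation every nonzero is shipped once under its shared index `k`, which is one plausible reading of why a column-row scheme could reduce communication for sparse inputs compared with dense blocks; the paper's actual partitioning and measurements are not reproduced here.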

Authors: 吴志川 (Wu Zhichuan), 毛琛 (Mao Chen), 韩蕾 (Han Lei), 陈立军 (Chen Lijun)

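Affiliation: Department of Computer Science, School of Information Science and Technology, Peking University (北京大学信息科学技术学院计算机系)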

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2013, No. 11, pp. 973-982 (10 pages)

Funding: National Natural Science Foundation of China

Keywords: sparse matrix multiplication; distributed computing; Hadoop

Classification: TP319 [Automation and Computer Technology: Computer Software and Theory]
