
基于Spark的大数据计算模型 (cited by: 5)

The Spark-based Large Data Computing Model
Abstract: As a third-generation machine learning tool, Spark is regarded as a next-generation data processing solution that could replace Hadoop. It covers iterative computation, batch processing, in-memory computation, stream processing, data query and analysis, and graph computation, and it provides a powerful in-memory computing engine. Spark is expected to become the next popular big data framework. The paper analyzes the Spark component ecosystem and the Lambda architecture, and closes by introducing applications of Spark in machine learning.
Authors: 王磊 (Wang Lei), 时亚文 (Shi Yawen)
Source: Computer Knowledge and Technology (《电脑知识与技术(过刊)》), 2016, No. 7X, pp. 7-8 (2 pages)
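Funding: 2015 research project of Shaanxi Radio and TV University (陕西广播电视大学), "Research on Information Technology and Cloud Computing Technology", project no. 15D-08-B08; 2015 teaching reform project of Shaanxi Business College (陕西工商职业学院), "Practical Research on Building Computer Course Resources in the Big Data Context", project no. GJ1529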
Keywords: machine learning; Spark; Hadoop

References (3)

  • 1. http://baike.baidu.com/link?url=NjUeVoyTiUBYebTHNOyw39VNZ1Yn9OMPz-SMujvalpeDTbcwuYNOQS5xRQttjvtXa3mOO5QdAI3Ho_H4dgsg8tywKzdDg_w3ZURoiHOCYK7
  • 2. 胡俊, 胡贤德, 程家兴. 基于Spark的大数据混合计算模型 [J]. 计算机系统应用, 2015, 24(4): 214-218. (cited by: 56)
  • 3. Nathan Marz, James Warren. Big Data: Principles and Best Practices of Scalable Realtime Data Systems. 2015.

