摘要
随着网络技术的发展,网络数据量正以指数级增长且规模日渐庞大。面对正在增长的海量数据,传统的数据处理方法存在效率低下等诸多缺点。人们需要一种新的技术思想来解决这些问题。因此,云计算的思想被提出。云计算是一种新兴的计算模型,是分布式计算技术的一种。而Hadoop作为一个开源的分布式平台是当前最为流行的云计算平台实现之一,被用于高效地处理海量数据。为了提高对海量数据处理的效率,文中首先简要分析了云计算的概念和Hadoop主要组件的工作流程,然后详细介绍了基于Hadoop的云计算平台配置方法和实现过程,并对云平台的搭建过程中遇到的典型问题进行了总结阐述。最后通过实验证明,该平台可以有效地完成分布式数据处理任务。
With the development of network technology,the number of online information is increasing in exponential and becoming larger and larger. With the growing amount of data,the traditional methods for processing massive data have many shortcomings like lowefficiency. A novel technology is needed to solve these problems,so the cloud computing has been brought. It is an emerging computational model,as a kind of distributed computing technology. Hadoop is one of the most popular cloud computing platforms as a kind of open sources distributed platform,which is always applied on the area that needs to handle massive data efficiently. In order to improve the efficiency of processing massive data,it briefly analyzes the concept of cloud computing and the work flowof the main components of Hadoop in this paper,then introduction of the implementation method of the cloud computing platform based on Hadoop in detail,discussion of the typical problems encountered in the process of building cloud computing platform. Finally,the experiments showthat the platform can effectively complete the processing tasks of distributed data.
出处
《计算机技术与发展》
2016年第7期127-132,共6页
Computer Technology and Development
基金
国家自然科学基金资助项目(61202098)