Abstract
With the development of information network technology, the user behavior data generated by websites is growing by hundreds of gigabytes per second. This data is characterized not only by its sheer volume but also by its mix of structured and unstructured content. To mine useful information from this user behavior data and serve users better, a data analysis platform needs to be built. As a mainstream framework for processing massive data, Hadoop has become a key technology for solving the problems of traditional data storage and data analysis, owing to its reliability, stability, and ease of scaling. This paper describes the structure, design ideas, and implementation methods of a Hadoop-based data analysis system.
Source
《工业控制计算机》
2019, Issue 10, pp. 137-138 (2 pages)
Industrial Control Computer
Funding
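Supported by the Nanjing University of Science and Technology Research Start-up Fund
Supported by the Natural Science Foundation of Jiangsu Province (BK20180467)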