Abstract
Data mining of the Internet plays a crucial role in the "PRISM" program. This paper first analyzes the basic technical principles of data mining, including common algorithms for association analysis and machine learning. It then describes the main techniques for Internet information retrieval and mining. Next, it proposes an architecture for an Internet big-data mining system based on an open-source cloud computing platform, including an algorithm for scheduling computing resources in the cloud. Finally, it points out directions for the future development of Internet big-data mining.
Source
《信息安全与通信保密》
2013, No. 12, pp. 87-91, 95 (6 pages in total)
Information Security and Communications Privacy
Keywords
data mining
information retrieval
cloud computing