Abstract
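Web data mining, currently a popular research area, is the process of extracting information (or knowledge) from WWW resources, that is, of extracting previously unknown patterns of potential practical value that are implicit in Web resources. Its general process can be expressed as: information discovery, information selection and preprocessing, analysis, and result generation [1]. Data collection on the Web is a supporting technology for Web data mining and is its first step. This paper proposes a Web data collection model based on XML technology and implements some of its main functions. It also explores some meaningful improvements that address the shortcomings of the model system.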
With the explosive growth of information sources available on the World Wide Web, it has become increasingly necessary for users to utilize automated tools in order to find, extract, filter, and evaluate the desired information and resources. Web mining has been proposed in response and is now widely studied. It is defined as the discovery and analysis of useful information from the World Wide Web, and its general process consists of information discovery, information selection and preprocessing, analysis, and result generation. Data collection on the Web is the first step of Web mining. In this paper we propose a Web data-collection model based on XML and implement some of its main functions. Finally, we discuss possible improvements that address the shortcomings of the model.
Source
《计算机工程与应用》
CSCD
PKU Core Journals (北大核心)
2004, No. 10, pp. 150-152, 156 (4 pages in total)
Computer Engineering and Applications