Abstract
Web data mining has been one of the most active research topics in data mining in recent years. Because Web documents are semi-structured, they must be preprocessed before any specific mining operation is performed. Focusing on the preprocessing stage of Web content mining, this article proposes a preprocessing method that uses XML as an intermediary language.
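To make the idea of XML as an intermediary representation concrete, the following is a minimal sketch, not the method proposed in the article, whose details are not given in this abstract. It assumes the preprocessing step amounts to turning loosely structured HTML into well-formed XML that later mining steps can query. The names `HtmlToXml`, `html_to_xml`, `VOID_TAGS`, and the wrapping `document` element are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumption: a small subset of HTML tags that never have a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class HtmlToXml(HTMLParser):
    """Hypothetical helper: builds an XML tree from semi-structured HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        # A synthetic root guarantees a single top-level element.
        self.root = ET.Element("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Attributes without a value (e.g. "disabled") get an empty string.
        elem = ET.SubElement(self.stack[-1], tag, {k: (v or "") for k, v in attrs})
        if tag not in VOID_TAGS:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close everything up to the nearest matching open tag;
        # stray end tags with no matching open tag are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Simplification: surrounding whitespace is dropped.
        text = data.strip()
        if not text:
            return
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element belongs to that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + text
        else:
            parent.text = (parent.text or "") + text


def html_to_xml(html: str) -> str:
    """Return a well-formed XML string for the given HTML fragment."""
    parser = HtmlToXml()
    parser.feed(html)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


if __name__ == "__main__":
    sample = "<html><body><h1>Title</h1><p>First<br>second</p><p>Third</body></html>"
    print(html_to_xml(sample))
```

Once the page is in this form, a later mining step could select content with standard XML tooling, for example `ET.fromstring(xml_text).iter("p")` to collect paragraph elements. Real pages may contain attribute names that are not valid in XML; this sketch does not handle that case.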
Source
《计算机时代》
2011, No. 6, pp. 45-46, 48 (3 pages in total)
Computer Era