Abstract
With the widespread adoption of the Internet and the growing informatization of enterprises, the volume of online information is increasing rapidly, and finding the information users actually need within this mass of data has become an increasingly important research problem. Text mining techniques can quickly and effectively extract valuable information from large amounts of data. The Internet has become a giant database of Web text resources, and the large quantity of heterogeneous, unstructured Web text poses new challenges for data mining. This paper introduces the general process of Web text mining and analyzes several of its key techniques.
Source
《现代计算机》
2009, No. 3, pp. 109-111, 127 (4 pages)
Modern Computer