
A Web Text Filtering Model Based on Fuzzy Similarity (cited by: 2)

The Feature Acquiring Algorithm on The Web Text
Abstract: The rapid growth of the Internet provides a vast amount of information resources. Web text mining is an efficient technique for discovering valuable and potentially useful knowledge in unstructured texts. In this paper, we build a text filtering model based on the vector space model (VSM): Web texts are represented with VSM, and a feature subset selection algorithm based on the genetic algorithm is presented. The algorithm can greatly improve the efficiency of processing Web texts and offers a better basis for classifying and clustering them. Our experiments show that the method performs well in feature dimension reduction.
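Source: Computer Science (《计算机科学》), CSCD, PKU Core Journal, 2001, No. 12, pp. 55-58 (4 pages)
Funding: Tianjin Natural Science Foundation (003700111) and (993600811)
Keywords: WWW; Web; text filtering model; fuzzy similarity; Internet; database; VSM; text filtering; genetic algorithm; text mining; KDD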
