
An Improved KNN Web Page Classification Algorithm (cited by: 3)

An Improved KNN Algorithm for Classification
Abstract: KNN is a lazy classifier with low efficiency: its classification step takes so long that it does not suit online web page classification. This paper proposes an improved method that optimizes the training dataset, extracting representative training samples and using them in place of the entire training set for similarity comparison. Experimental results show that classification with the representative sample set performs comparably to traditional KNN while shortening classification time and improving efficiency. The method also does not require estimating K, which reduces the bias introduced by manual estimation.
Authors: LI Cunhe (李村合), FENG Jing (冯静)
Source: Microcomputer Applications (《微计算机应用》), 2008, Issue 3, pp. 21-25 (5 pages)
Keywords: web page classification; KNN; representative samples; dataset optimization; similarity
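
The idea in the abstract, comparing a new page against a small set of representative samples instead of the whole training set, can be illustrated with a minimal sketch. The abstract does not say how the representatives are extracted, so this sketch assumes one representative per class, taken as the class centroid, and assigns a page the label of its most similar representative under cosine similarity. The function names (`cosine`, `build_representatives`, `classify`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_representatives(samples, labels):
    """Reduce the training set to one representative vector per class.

    Assumption: the representative is the class centroid (mean vector).
    The paper's actual selection rule is not given in the abstract.
    """
    groups = defaultdict(list)
    for vec, label in zip(samples, labels):
        groups[label].append(vec)
    reps = {}
    for label, vecs in groups.items():
        dim = len(vecs[0])
        reps[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return reps


def classify(doc, reps):
    """Return the label of the representative most similar to doc.

    No K is needed: each class contributes exactly one candidate.
    """
    return max(reps, key=lambda label: cosine(doc, reps[label]))


# Toy term-weight vectors for two hypothetical page categories.
train = [[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]]
labels = ["sports", "sports", "finance", "finance"]
reps = build_representatives(train, labels)
print(classify([1, 0, 0], reps))
print(classify([0, 2, 1], reps))
```

In this sketch each query costs one similarity computation per class rather than one per training sample, which is where a time saving of the kind the abstract reports would come from. Whether accuracy stays comparable depends on how well the representatives cover each class.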
