
An Improved KNN Text Categorization Algorithm Based on Density (cited by: 2)
Abstract: The KNN algorithm is widely used in the field of artificial intelligence. As a text categorization algorithm, it is simple, effective, and easy to implement. However, the time complexity of KNN classification is directly proportional to the number of training samples, and an uneven distribution density of the training samples lowers categorization accuracy. This paper proposes an improved algorithm based on KNN. The algorithm analyzes the distribution density of the training samples and prunes samples in high-density regions, which reduces the sample count and evens out the training sample distribution, with the aim of improving categorization accuracy. Experiments show that the density-based improved KNN text categorization algorithm lowers time complexity while also achieving a better accuracy rate and recall rate than common KNN.
Source: Journal of ZhangZhou Teachers College (Natural Science), 2012, No. 2, pp. 45-48 (4 pages)
Keywords: KNN; Text Categorization; Sample Reduction
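
The abstract does not spell out how density is measured or which samples are pruned, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that the density of a sample is the number of same-class training samples within a radius `eps`, and that a sample is dropped when that count exceeds `max_neighbors`. The names `prune_dense_samples`, `knn_predict`, `eps`, and `max_neighbors`, the Euclidean distance, and the toy 2-D data are all hypothetical choices made for illustration; real text vectors would more likely be TF-IDF features compared by cosine similarity.

```python
import math
from collections import Counter


def prune_dense_samples(samples, labels, eps, max_neighbors):
    """Greedily drop samples that sit in crowded same-class regions.

    Assumption (not from the paper): density of a sample is the number of
    still-kept samples of the same class within Euclidean distance eps.
    """
    keep = [True] * len(samples)
    for i, (x, label) in enumerate(zip(samples, labels)):
        crowd = sum(
            1
            for j, (other, other_label) in enumerate(zip(samples, labels))
            if j != i
            and keep[j]
            and other_label == label
            and math.dist(x, other) <= eps
        )
        if crowd > max_neighbors:
            keep[i] = False  # region is still too dense: discard this sample
    kept_samples = [s for s, k in zip(samples, keep) if k]
    kept_labels = [lab for lab, k in zip(labels, keep) if k]
    return kept_samples, kept_labels


def knn_predict(samples, labels, query, k):
    """Plain KNN: majority vote among the k nearest training samples."""
    nearest = sorted(
        range(len(samples)), key=lambda i: math.dist(samples[i], query)
    )[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: class "a" is a tight cluster, class "b" is spread out.
train_x = [
    (0.0, 0.0), (0.05, 0.0), (0.0, 0.05), (0.05, 0.05), (0.1, 0.1),  # class a
    (1.0, 1.0), (1.3, 1.0), (2.0, 2.0),                              # class b
]
train_y = ["a", "a", "a", "a", "a", "b", "b", "b"]

pruned_x, pruned_y = prune_dense_samples(train_x, train_y, eps=0.2, max_neighbors=1)

query = (0.7, 0.7)  # lies closer to the sparse "b" region
print(len(train_x), "->", len(pruned_x))           # 8 -> 5
print(knn_predict(train_x, train_y, query, k=5))    # a (dense cluster outvotes b)
print(knn_predict(pruned_x, pruned_y, query, k=5))  # b
```

On this toy data the dense cluster supplies three of the five nearest neighbors and outvotes the two closer "b" samples; after pruning, fewer distance computations are needed per query and the vote is no longer dominated by the crowded class. Whether this carries over to real corpora depends on the choice of `eps`, `max_neighbors`, and `k`, which would need tuning.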
