Abstract
This paper introduces the KNN text classification method based on the vector space model (VSM), analyzes the essence of the method, and points out its weaknesses. It then proposes an improved document similarity measure for KNN classification, based on text attribute association and concept co-occurrence. Classification experiments show that classification accuracy improves by 10% on average.
Source
《中原工学院学报》
CAS
2009, No. 4, pp. 27-29 (3 pages)
Journal of Zhongyuan University of Technology
Funding
Soft Science Project of the Education Department of Henan Province (2008b520046)
Keywords
text classification
KNN
vector model
similarity