Abstract
Classification is an essential part of data mining: it builds a model from data whose class labels are known, then uses that model to predict the classes of data whose labels are unknown. As a simple, effective, nonparametric classification method, the k-nearest-neighbor (KNN) method is widely used in text classification, but its accuracy drops when the training samples are unevenly distributed. To address this problem, this paper proposes a KNN classification method based on relative distance, which reduces the misclassification of test samples near class boundaries. Experimental results show that the method performs well.
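The abstract does not define "relative distance", so the following is only a minimal sketch of the general idea, not the paper's method. It compares plain KNN with a hypothetical variant that divides each distance by a local scale of the training point (the mean distance to its nearest same-class neighbors), so that points in sparse classes are not systematically outvoted by dense ones. The function names, the `local_scale` definition, the parameters `k` and `m`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Plain KNN: majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def local_scale(train, index, m=2):
    """Hypothetical density estimate (an assumption, not from the paper):
    mean distance from train[index] to its m nearest same-class points."""
    point, label = train[index]
    dists = sorted(
        euclidean(point, other)
        for j, (other, other_label) in enumerate(train)
        if j != index and other_label == label
    )
    closest = dists[:m]
    return sum(closest) / len(closest) if closest else 1.0


def relative_knn_predict(train, query, k=3, m=2):
    """KNN on distances divided by each training point's local scale."""
    scored = sorted(
        (
            (euclidean(point, query) / max(local_scale(train, i, m), 1e-12), label)
            for i, (point, label) in enumerate(train)
        ),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: class "A" is densely sampled, class "B" is sparsely sampled.
train = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.2, 0.0), "A"), ((0.3, 0.0), "A"),
    ((2.0, 0.0), "B"), ((4.0, 0.0), "B"), ((6.0, 0.0), "B"),
]
query = (1.0, 0.0)  # lies in the gap between the two classes

plain_label = knn_predict(train, query, k=3)
relative_label = relative_knn_predict(train, query, k=3, m=2)
print(plain_label, relative_label)
```

On this toy set the two rules disagree for the boundary query: the dense class wins the plain vote, while the scaled distances favor the sparse class, whose typical spacing makes the query comparatively close. Which answer is correct depends on the data, so the sketch only shows how uneven density can change the vote.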
Source
《价值工程》 (Value Engineering)
2014, No. 2, pp. 180-182 (3 pages)