
Weighted KNN Classification Algorithm Based on Mean Distance of Category
Cited by: 12
Abstract: This paper proposes an improved KNN classification algorithm that exploits the fact that samples of the same category in a sample set lie close to one another. For each of the K nearest neighbors of the sample to be classified, the algorithm computes the difference ratio between the mean distance of the category the neighbor belongs to and the distance from the neighbor to the sample to be classified. If this ratio exceeds a threshold, the neighbor is removed from the K nearest neighbors. The per-category neighbor counts are then weighted by the difference ratio, and a weighted majority vote decides the category of the sample. The improved algorithm raises classification accuracy while keeping time complexity comparable to that of traditional KNN.
Author: Yan Xiaoming (严晓明)
Source: Computer Systems & Applications (《计算机系统应用》), 2014, No. 2, pp. 128-132 (5 pages)
Funding: Fujian Provincial Department of Education Class B Fund (JB11036)
Keywords: mean distance of category; KNN; weighted algorithm
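
The procedure outlined in the abstract can be sketched roughly as follows. The abstract does not give the exact formulas, so several details here are assumptions rather than the author's definitions: the category mean distance is taken as the mean pairwise Euclidean distance within a category, the difference ratio as (d - m) / m for neighbor distance d and category mean m, and the vote weight as 1 / (1 + max(ratio, 0)). The function names, the `threshold` default, and the fallback to a plain vote when every neighbor is removed are likewise hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def class_mean_distances(X, y):
    """Mean pairwise Euclidean distance inside each category (assumed definition)."""
    by_class = defaultdict(list)
    for point, label in zip(X, y):
        by_class[label].append(point)
    means = {}
    for label, pts in by_class.items():
        pairs = list(combinations(pts, 2))
        # A category with a single sample has no pairs; use 0.0 as a placeholder.
        means[label] = (
            sum(math.dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
        )
    return means


def weighted_knn_predict(X, y, query, k=5, threshold=1.0, means=None, eps=1e-12):
    """Classify `query` with a filtered, weighted vote over its k nearest neighbors."""
    # The category means depend only on the training set, so they can be
    # computed once and passed in via `means` for repeated queries.
    if means is None:
        means = class_mean_distances(X, y)

    # Step 1: find the k nearest neighbors, exactly as in plain KNN.
    scored = [(math.dist(p, query), label) for p, label in zip(X, y)]
    neighbours = sorted(scored, key=lambda t: t[0])[:k]

    votes = defaultdict(float)
    for dist, label in neighbours:
        # Step 2: difference ratio (assumed form): how much farther the
        # neighbor is than its category's mean distance, relative to that mean.
        ratio = (dist - means[label]) / (means[label] + eps)
        # Step 3: discard neighbors whose ratio exceeds the threshold.
        if ratio > threshold:
            continue
        # Step 4: weight the vote by the ratio (assumed form); neighbors
        # closer than the category mean count as a full vote.
        votes[label] += 1.0 / (1.0 + max(ratio, 0.0))

    if not votes:
        # Fallback (assumption): every neighbor was discarded, so use a plain vote.
        for _, label in neighbours:
            votes[label] += 1.0
    return max(votes, key=votes.get)
```

In this sketch, a query that sits far from a very compact category, measured against that category's own spread, loses those neighbors before the vote, which is the effect the abstract describes. Precomputing `means` keeps the per-query cost at the level of an ordinary KNN search plus a constant amount of work per neighbor.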


