Abstract
This paper proposes SPSOKNN, an efficient k-nearest-neighbor (KNN) text classification algorithm that aims to reduce the computational complexity of KNN text classification. The algorithm uses the random search capability of particle swarm optimization (PSO) to search the training document set. While searching for the k nearest neighbors of a test sample, the particles move in jumps and skip over large numbers of document vectors that cannot be among the k nearest neighbors. The influence of particle velocity on the swarm's evolution is also removed, so the k nearest neighbors of a test sample can be found more quickly. Validation experiments show that, when searching for the same k nearest neighbors, SPSOKNN achieves higher classification accuracy than the basic KNN algorithm.
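The abstract gives no update equations, so the following is only a minimal sketch of what a velocity-free, swarm-style neighbor search might look like, not the authors' SPSOKNN. Everything specific here is an assumption made for illustration: the function names (`cosine`, `swarm_knn`, `classify`), cosine similarity as the similarity measure, particles living in the index space of the training list, and the particular jump rule (a random pull toward the personal and global best plus a small random hop).

```python
import heapq
import math
import random
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def swarm_knn(train, query, k, n_particles=8, n_iters=20, seed=0):
    """Approximate k-NN search; returns [(similarity, index), ...], best first.

    Assumption: a particle's position is an index into `train`, and it is
    updated directly from the best positions seen so far (no velocity term).
    Only the documents that particles land on are ever scored.
    """
    rng = random.Random(seed)
    n = len(train)
    n_particles = min(n_particles, n)
    sims = {}  # index -> similarity, so each visited document is scored once

    def score(i):
        if i not in sims:
            sims[i] = cosine(train[i], query)
        return sims[i]

    positions = rng.sample(range(n), n_particles)  # distinct random starts
    pbest = list(positions)
    gbest = max(positions, key=score)

    for _ in range(n_iters):
        for p in range(n_particles):
            x = positions[p]
            # Velocity-free jump: random pull toward personal and global best.
            step = rng.random() * (pbest[p] - x) + rng.random() * (gbest - x)
            # Small random hop keeps particles from freezing in place.
            x = int(round(x + step)) + rng.randint(-2, 2)
            x = min(max(x, 0), n - 1)  # clamp to valid indices
            positions[p] = x
            if score(x) > score(pbest[p]):
                pbest[p] = x
            if score(x) > score(gbest):
                gbest = x

    # The k most similar documents among those the swarm actually visited.
    return heapq.nlargest(k, ((s, i) for i, s in sims.items()))


def classify(train, labels, query, k, **search_args):
    """Majority vote over the neighbors returned by swarm_knn."""
    hits = swarm_knn(train, query, k, **search_args)
    votes = Counter(labels[i] for _, i in hits)
    return votes.most_common(1)[0][0]
```

Because only visited documents are scored, the result is approximate: it matches an exhaustive KNN search only when the swarm happens to land on the true neighbors. Setting `n_particles` to the size of the training set makes the initial sample cover every document, which reduces the sketch to a brute-force search and is a convenient way to check it.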
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2008, No. 32, pp. 57-59 (3 pages)
Funding
2006 University Scientific Research Program of the Education Department of Zhejiang Province (No. 20060347).
Keywords
K Nearest Neighbor (KNN) classifier
Particle Swarm Optimization (PSO)
similarity