Abstract
Particle Swarm Optimization (PSO) tends to fall into local optima in the later iterations when clustering high-dimensional, sparse text data. To address this, a Differential Evolution (DE) strategy with better diversity is used to update the population in an attempt to find a better global optimum. In addition, because the centroids encoded in each individual are arranged in random order, which affects learning and updating between individuals, a method is proposed that adaptively adjusts the centroid order so that the most similar centroids of different individuals are placed at the same cluster index as far as possible. Tests on text datasets show that the proposed Index adaptive DEPSO (IDEPSO) algorithm outperforms other existing algorithms on internal and external indicators, demonstrating its effectiveness and feasibility.
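The centroid-order adjustment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact matching rule, so the greedy pairing by cosine similarity below is an assumption, and the names `cosine` and `align_centroids` are hypothetical helpers introduced only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def align_centroids(reference, individual):
    """Reorder `individual` so its centroids line up with `reference`.

    Assumption: greedy matching, which may differ from the paper's rule.
    The most similar unmatched (reference, individual) pair is fixed first.
    """
    k = len(reference)
    # Score every (reference index, individual index) pair, best first.
    pairs = sorted(
        ((cosine(reference[i], individual[j]), i, j)
         for i in range(k) for j in range(k)),
        key=lambda t: -t[0],
    )
    aligned = [None] * k
    used_ref, used_ind = set(), set()
    for _, i, j in pairs:
        if i not in used_ref and j not in used_ind:
            aligned[i] = individual[j]  # centroid j moves to cluster index i
            used_ref.add(i)
            used_ind.add(j)
    return aligned

# Two individuals encoding roughly the same clusters in a different order.
reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shuffled = [[0.0, 0.1, 0.9], [0.9, 0.1, 0.0], [0.1, 0.8, 0.0]]
aligned = align_centroids(reference, shuffled)
```

After alignment, dimension-wise operations between the two individuals (for example, a PSO velocity update or a DE difference vector) compare centroids that describe similar clusters rather than arbitrary ones.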
Authors
胡晓敏
王明丰
张首荣
李敏
HU Xiaomin; WANG Mingfeng; ZHANG Shourong; LI Min (School of Computers, Guangdong University of Technology, Guangzhou 510006, China; School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China)
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2021, Issue 4, pp. 61-67 (7 pages)
Funding
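National Natural Science Foundation of China (61772142)
Natural Science Foundation of Guangdong Province, General Program (2019A1515011270)
Guangzhou Pearl River S&T Nova Program (201806010059)
Guangdong Provincial Key Laboratory of Cyber-Physical Systems project (2016B030301008)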