Abstract
After studying existing improvements to the traditional K-means clustering algorithm, and to address the problem that outliers degrade clustering results, this paper proposes combining the Local Outlier Factor (LOF) outlier detection algorithm with traditional K-means. First, LOF is used to preprocess the data set, and outliers are filtered out according to a set ratio. Then the K-means algorithm is used to cluster the data set, and the results obtained without LOF preprocessing are compared with those obtained after preprocessing. Simulation experiments show that, compared with traditional K-means, the proposed algorithm yields a larger inter-class distance and a smaller intra-class distance, and therefore a better clustering result.
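The two-stage pipeline described in the abstract (LOF filtering, then K-means) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the neighbourhood size `k=5`, the filter ratio `0.04`, the toy data, and the helper names `lof_scores`, `filter_outliers`, `kmeans`, and `sse` are all assumptions made for illustration, and the LOF here is a simplified brute-force variant that takes exactly `k` neighbours and ignores distance ties.

```python
import math
import random

def lof_scores(points, k):
    """Simplified brute-force LOF: exactly k neighbours, ties ignored (assumption)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding itself.
    nbrs = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
            for i in range(n)]
    k_dist = [dist[i][nbrs[i][-1]] for i in range(n)]
    # Local reachability density = inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in nbrs[i]) / k
        lrd.append(1.0 / max(reach, 1e-12))
    # LOF = mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in nbrs[i]) / (k * lrd[i]) for i in range(n)]

def filter_outliers(points, k, ratio):
    """Drop the `ratio` fraction of points with the highest LOF scores."""
    scores = lof_scores(points, k)
    n_keep = len(points) - int(len(points) * ratio)
    order = sorted(range(len(points)), key=lambda i: scores[i])
    return [points[i] for i in sorted(order[:n_keep])]

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from the data."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def sse(points, centers, labels):
    """Within-cluster sum of squared distances (an intra-class distance measure)."""
    return sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

# Toy data (assumption): two tight 5x5 grids plus two isolated points.
cluster_a = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
cluster_b = [(5 + x * 0.1, 5 + y * 0.1) for x in range(5) for y in range(5)]
data = cluster_a + cluster_b + [(10.0, 0.0), (-4.0, 6.0)]

cleaned = filter_outliers(data, k=5, ratio=0.04)   # removes 2 of 52 points
sse_raw = sse(data, *kmeans(data, k=2))
sse_clean = sse(cleaned, *kmeans(cleaned, k=2))
print(len(data), len(cleaned), sse_clean < sse_raw)
```

On this toy data the two isolated points receive by far the highest LOF scores, so they are the ones removed, and the within-cluster sum of squares after filtering is smaller than on the raw data. The inter-class distance reported in the paper is not computed here.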
Authors
杨红
李丹宁
王雅洁
YANG Hong; LI Dan-ning; WANG Ya-jie (College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou 550025, China; Guizhou Food Safety Testing Application Engineering Technology Research Center Co., Ltd., Guiyang, Guizhou 550022, China)
Source
《通信技术》
2019, No. 8, pp. 1884-1888 (5 pages)
Communications Technology