Abstract
The initial clustering centers are the points or objects first selected as centers during clustering. Traditional K-means chooses its initial clustering centers at random, which makes its clustering results unstable. To address this problem, the PCA-AKM algorithm is proposed. The algorithm uses principal component analysis (PCA) to extract the principal components of the data set and thereby reduce its dimensionality, and then uses a self-defined indicator, the density weight (Dw), to select the initial clustering centers, so that the clustering centers do not fall into a local optimum. A comparison with K-means on UCI data sets shows that PCA-AKM gives more stable clustering than traditional K-means. In an intrusion detection simulation on the KDD CUP99 data set, the algorithm achieves a high detection rate and a low false detection rate, and effectively improves the accuracy of intrusion detection.
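The pipeline the abstract describes (PCA for dimensionality reduction, deterministic selection of initial centers, then K-means) can be sketched as below. This is a minimal illustration, not the paper's implementation: the abstract does not define Dw, so the density score used here (neighbor count within the mean pairwise distance, multiplied by the distance to the centers already chosen) is an assumed stand-in, and all function names (`pca_reduce`, `density_init`, `kmeans`, `pca_akm_sketch`) are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pca_reduce(X, n_components, iters=100, seed=0):
    """Project X onto its top principal components (power iteration with deflation)."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mean[j] for j in range(d)] for row in X]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)] for a in range(d)]
    rng = random.Random(seed)  # fixed seed keeps the projection reproducible
    components = []
    for _ in range(n_components):
        v = [rng.random() + 0.1 for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged unit vector.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflate so the next pass finds the next component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in Xc]


def density_init(X, k):
    """Pick k initial centers deterministically (assumed stand-in for the Dw indicator)."""
    n = len(X)
    D = [[dist(a, b) for b in X] for a in X]
    radius = sum(map(sum, D)) / (n * (n - 1))  # mean pairwise distance
    density = [sum(1 for d in row if d <= radius) for row in D]
    chosen = [max(range(n), key=lambda i: density[i])]
    while len(chosen) < k:
        # Favor dense points that lie far from every center chosen so far.
        nxt = max(range(n), key=lambda i: density[i] * min(D[i][c] for c in chosen))
        chosen.append(nxt)
    return [list(X[i]) for i in chosen]


def kmeans(X, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda c: dist(x, centers[c])) for x in X]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [x for x, l in zip(X, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def pca_akm_sketch(X, k, n_components):
    """Reduce with PCA, seed centers by density, then run K-means."""
    Z = pca_reduce(X, n_components)
    return kmeans(Z, density_init(Z, k))
```

Because no step draws the initial centers at random, repeated calls to `pca_akm_sketch` on the same data return the same labels, which is the stability property the abstract contrasts with randomly initialized K-means.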
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2018, Issue 2, pp. 226-230 (5 pages)
Funding
Supported by the National Science and Technology Support Program of the 12th Five-Year Plan (2014BAL04B06)
Keywords
K-means algorithm
Principal component analysis
Density weight (Dw)
Intrusion detection