Abstract
High-dimensional data are affected by redundant and noisy data, which lowers both the efficiency and the accuracy of clustering. Based on the properties of the eigenvalues and eigenvectors of the Laplacian matrix, a new cluster center selection algorithm suited to high-dimensional data sets is introduced. The algorithm uses the Laplacian matrix to reduce the dimensionality of the data before the candidate cluster centers are selected. This dimensionality reduction improves the accuracy of the candidate cluster centers, increases the clustering accuracy, and broadens the range of data types that can be clustered. Cluster analysis was carried out on ten data sets with different numbers of samples, dimensions, and categories. The experimental results demonstrate the effectiveness of the new cluster center selection method based on Laplacian dimensionality reduction.
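The abstract does not say how the graph is built or how candidate centers are chosen from the reduced data, so the following is only a minimal sketch of the general idea of Laplacian-based dimensionality reduction (in the style of Laplacian eigenmaps), not the authors' algorithm. The function name `laplacian_embedding`, the Gaussian similarity kernel, the bandwidth `sigma`, and the use of the symmetric normalized Laplacian are all assumptions made for illustration.

```python
import numpy as np


def laplacian_embedding(X, n_components=2, sigma=1.0):
    """Embed the rows of X with eigenvectors of a normalized graph Laplacian.

    Illustrative assumptions (not taken from the paper): a fully connected
    Gaussian similarity graph and the symmetric normalized Laplacian.
    """
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against negative round-off

    # Gaussian similarity matrix without self-loops.
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    degrees = np.maximum(W.sum(axis=1), 1e-12)  # avoid division by zero
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    L = np.eye(len(X)) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]

    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)

    # Skip the first eigenvector (eigenvalue close to 0), keep the next ones.
    return eigvals, eigvecs[:, 1:n_components + 1]


# Toy data: two well-separated groups in 50 dimensions.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.1, size=(20, 50))
group_b = rng.normal(1.0, 0.1, size=(20, 50))
X = np.vstack([group_a, group_b])

eigvals, Y = laplacian_embedding(X, n_components=1, sigma=2.0)
```

On this toy data the single retained coordinate in `Y` takes opposite signs for the two groups, so any center selection step run afterwards would operate on a one-dimensional representation instead of the original 50 dimensions. The choice of `sigma` strongly affects the graph and would need tuning on real data.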
Authors
刘颖
张艳邦
LIU Ying; ZHANG Yanbang (College of Mathematics & Information, Xianyang Normal University, Xianyang 712000, China)
Source
《天津科技大学学报》
CAS
2019, No. 3, pp. 76-80 (5 pages)
Journal of Tianjin University of Science & Technology
Funding
National Natural Science Foundation of China (61501388)