Abstract
To address the difficulty of determining the initial cluster centers in the kernel K-means algorithm, this paper proposes a kernel K-means clustering algorithm based on local density (LDKK). The algorithm uses the local relative density of each sample to select samples with high density and low mutual similarity as the initial cluster centers. Experimental results show that the algorithm effectively excludes the influence of cluster-edge points and noise points, adapts to data sets whose actual classes have imbalanced density distributions, and ultimately produces clusterings of higher quality and lower variability.
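The initialization idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the local relative density or the selection rule, so everything here is an assumption made for illustration and not the paper's LDKK procedure: the density of a sample is taken as its mean Gaussian-kernel similarity to its `k` most similar neighbours, and centers are picked greedily by density weighted by dissimilarity to the centers already chosen. The names `rbf`, `local_density`, and `pick_initial_centers` are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def local_density(points, k=3, gamma=1.0):
    """Assumed density: mean kernel similarity to the k most similar other samples."""
    dens = []
    for i, p in enumerate(points):
        sims = sorted(
            (rbf(p, q, gamma) for j, q in enumerate(points) if j != i),
            reverse=True,
        )
        dens.append(sum(sims[:k]) / k)
    return dens

def pick_initial_centers(points, n_clusters, k=3, gamma=1.0):
    """Greedily pick indices of dense samples that are dissimilar to earlier picks."""
    dens = local_density(points, k, gamma)
    # Start from the densest sample.
    chosen = [max(range(len(points)), key=lambda i: dens[i])]
    while len(chosen) < n_clusters:
        def score(i):
            # High density and low similarity to every chosen center (assumed rule).
            closest = max(rbf(points[i], points[c], gamma) for c in chosen)
            return dens[i] * (1.0 - closest)
        candidates = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return chosen

# Two tight groups (indices 0-3 and 4-7) plus one isolated noise point (index 8).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
centers = pick_initial_centers(points, n_clusters=2)
```

In this toy data set the isolated point has near-zero density, so it is never selected, and the second center falls in the group that does not contain the first. The chosen indices would then seed an ordinary kernel K-means iteration.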
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal (北大核心)
2011, No. 1, pp. 78-80, 90 (4 pages)
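Funding
Jiangsu Province "Qing Lan Project" (青蓝工程)
Jiangsu Province "Six Talent Peaks" Project (07-E-025)
Major Natural Science Research Project of Jiangsu Higher Education Institutions (08KJA520001)
National Innovation Fund for Small and Medium-sized Enterprises (09C26213203797)
National Natural Science Foundation of China (70971067)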