Abstract
A K-means clustering algorithm based on probability selection is proposed and applied on the Spark platform for image clustering. The resulting data set is much smaller than the initial data set, which greatly reduces the number of iterations and speeds up clustering. The improved K-means algorithm is applied on Spark to rock image processing: features are extracted from the rock images so that the images are easy to distinguish. The method addresses shortcomings of traditional clustering algorithms, namely that the initial centers cannot be determined in advance, that a poorly chosen number of clusters K can cause clustering to fail, and that the algorithm is sensitive to noise and isolated points.
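The abstract does not state the exact probability-selection rule, so the following is only a minimal sketch of one common way to choose initial centers probabilistically. It assumes k-means++-style weighting, where each new center is drawn with probability proportional to its squared distance from the nearest center already chosen. The function names `probabilistic_seeds` and `kmeans` are hypothetical, and the code is plain single-machine Python rather than the paper's Spark implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def probabilistic_seeds(points, k, rng):
    """Pick k initial centers (assumed rule: k-means++-style D^2 weighting).

    The first center is drawn uniformly; each later center is drawn with
    probability proportional to its squared distance to the nearest
    center chosen so far.
    """
    centers = [rng.choice(points)]
    nearest = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(nearest) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            idx = rng.choices(range(len(points)), weights=nearest, k=1)[0]
            centers.append(points[idx])
        # Update each point's distance to its nearest chosen center.
        nearest = [
            min(old, squared_distance(p, centers[-1]))
            for old, p in zip(nearest, points)
        ]
    return centers


def kmeans(points, k, rng, max_iter=100):
    """Standard Lloyd iterations starting from the probabilistic seeds."""
    centers = probabilistic_seeds(points, k, rng)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: squared_distance(p, centers[c]))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

A call such as `kmeans(points, 2, random.Random(0))` returns the final centers and the points assigned to each. Spreading the initial centers apart in this way is one reason a seeded K-means can need fewer iterations than uniform random initialization, though the paper's own selection rule may differ.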
Source
《西安石油大学学报(自然科学版)》
CAS
Peking University Core Journal (北大核心)
2016, No. 6, pp. 114-118 (5 pages)
Journal of Xi'an Shiyou University (Natural Science Edition)
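Funding
National Science and Technology Major Project (No. 2016ZX05007G003)
Shaanxi Province Industrial Science and Technology Research Project (No. 2015GY104)
Major Science and Technology Project of PetroChina Company Limited (No. 2011EG1301)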