
A Big Data K-Means Clustering Method Based on Improved Prediction Strength
Abstract: To reduce the influence of chance factors, a big data K-means clustering method based on improved prediction strength is proposed. The basic idea is as follows. The data set is first divided into several equal parts, each part serves in turn as the test set, and the prediction strength is averaged over these rounds. The number of clusters and the clustering variables are then determined from the prediction strength, and the data set is clustered with the K-means method. The method was applied to the average time that visitors spend in each section of a website. The results indicate that clustering based on prediction strength is better suited than conventional clustering to the cluster analysis of big data.
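The abstract does not spell out the prediction strength formula, so the following is only a minimal sketch under stated assumptions. It uses the commonly cited definition of prediction strength (for each test-set cluster, the share of point pairs that the training-set centers also keep together, minimized over clusters) and averages it over rotating test parts as the abstract describes. The function names, the plain Lloyd's K-means, the 0.8 threshold, and the rule "largest k whose averaged score reaches the threshold" are illustrative assumptions, not the authors' implementation. Selecting clustering variables is not covered.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: dist2(point, centers[j]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns k centers. Requires k <= len(points)."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center.
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def prediction_strength(train, test, k, seed=0):
    """Minimum, over test clusters, of the share of pairs kept together
    when the test points are assigned to the training centers."""
    if k == 1:
        return 1.0
    train_centers = kmeans(train, k, seed=seed)
    test_centers = kmeans(test, k, seed=seed)
    test_labels = [nearest(p, test_centers) for p in test]
    pred_labels = [nearest(p, train_centers) for p in test]
    strengths = []
    for j in range(k):
        members = [i for i, lab in enumerate(test_labels) if lab == j]
        n = len(members)
        if n < 2:
            continue  # a cluster with fewer than two points has no pairs
        counts = {}
        for i in members:
            counts[pred_labels[i]] = counts.get(pred_labels[i], 0) + 1
        same = sum(c * (c - 1) for c in counts.values())  # ordered pairs
        strengths.append(same / (n * (n - 1)))
    return min(strengths) if strengths else 0.0

def averaged_prediction_strength(points, k, folds=5, seed=0):
    """Split into near-equal parts; each part is the test set once."""
    pts = list(points)
    random.Random(seed).shuffle(pts)
    parts = [pts[i::folds] for i in range(folds)]
    scores = []
    for i, test in enumerate(parts):
        train = [p for j, part in enumerate(parts) if j != i for p in part]
        scores.append(prediction_strength(train, test, k, seed=seed))
    return sum(scores) / folds

def choose_k(points, k_max=6, threshold=0.8, folds=5, seed=0):
    """Largest k whose averaged score reaches the (assumed) threshold."""
    scores = {k: averaged_prediction_strength(points, k, folds, seed)
              for k in range(1, k_max + 1)}
    best = max(k for k, s in scores.items() if s >= threshold)
    return best, scores
```

Calling `choose_k(data, k_max=6)` on a list of equal-length numeric tuples returns the selected cluster count together with the averaged score for each candidate k. Each test part must contain at least `k_max` points. Averaging over every part, rather than relying on a single train/test split, is what damps the effect of one unlucky split.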
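Authors: 蔡洪山 (Cai Hongshan), 许峰 (Xu Feng)
Source: 《软件导刊》 (Software Guide), 2016, No. 5, pp. 4-6 (3 pages)
Funding: Natural Science Foundation Project of the Anhui Provincial Department of Education (2014KB236)
Keywords: Big Data; K-Means Clustering; Prediction Strength; Website Column Access Analysis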