Research on Initial Value Selection for the K-means Algorithm (K-means算法的初始值选取问题的研究)
Abstract: With the explosive growth of data, data mining algorithms are used ever more frequently, so choosing a suitable algorithm for data analysis is essential. To address the problem of determining the initial values of the K-means algorithm, this paper proposes an optimization scheme based on data preprocessing. The target dataset is first processed with the Canopy algorithm, and the resulting groups are then denoised and merged. The final number of groups is taken as the K value for K-means, and the centroid of each group is taken as an initial cluster center, which together determine the initial values of K-means. Comparative experiments show that the optimized K-means algorithm achieves better clustering results.
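
The pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the Canopy thresholds or the denoising and merging rules, so the thresholds `t1` and `t2`, the `min_size` cutoff for dropping small groups, the `merge_dist` rule for merging nearby groups, and all function names are assumptions made here. Points are plain tuples and distances are Euclidean.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def canopy(points, t1, t2):
    """Textbook Canopy pass with loose threshold t1 > tight threshold t2 > 0.

    Assumption: the first remaining point is used as each canopy center.
    """
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining[0]
        # Every point within t1 of the center joins this canopy.
        canopies.append([p for p in remaining if math.dist(center, p) < t1])
        # Points within t2 of the center can no longer start a new canopy.
        remaining = [p for p in remaining if math.dist(center, p) >= t2]
    return canopies


def merge_close(groups, merge_dist):
    """Assumed merge rule: join groups whose centroids are closer than merge_dist."""
    merged = []
    for group in groups:
        c = centroid(group)
        for target in merged:
            if math.dist(centroid(target), c) < merge_dist:
                target.extend(group)
                break
        else:
            merged.append(list(group))
    return merged


def canopy_init(points, t1, t2, min_size=2, merge_dist=None):
    """Return initial centers; K is the number of centers returned."""
    # Assumed denoising rule: drop canopies with fewer than min_size members.
    groups = [g for g in canopy(points, t1, t2) if len(g) >= min_size]
    groups = merge_close(groups, t2 if merge_dist is None else merge_dist)
    return [centroid(g) for g in groups]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        updated = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (50, 50)]  # the last point plays the role of noise
    init = canopy_init(data, t1=3, t2=2)
    final_centers, clusters = kmeans(data, init)
    print(len(init), init)
    print(final_centers)
```

In this sketch K is never supplied by the user: it is simply `len(init)`, and both `t1` and `t2` would need to be tuned to the scale of the data.
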
Authors: YAO Meng (姚蒙); HE Pengcheng (何鹏程) (Network Information Management Center, Xinyang Vocational and Technical College, Xinyang, China, 464000; Department of Mathematics and Information Engineering, Xinyang Vocational and Technical College, Xinyang, China, 464000)
Source: Journal of Fujian Computer (《福建电脑》), 2023, No. 7, pp. 57-61 (5 pages)
Keywords: Data Mining; K-means Algorithm; Canopy Algorithm; Cluster Center