Abstract: Clustered data describe the dynamic relationships among research subjects within groups and are widely used in economics, social science, medicine, and other fields. Classical cluster analysis methods typically measure the similarity between samples in order to cluster samples or indicators, whereas the clustering of subgroups of clustered data has received little attention. This paper builds a factor analysis model for clustered data, uses the principal component method to generate clustered data whose groups differ from one another, and applies the K-means method to cluster the groups. In a simulation study, clustered data are generated from the factor analysis model by the principal component method, and the results demonstrate the effectiveness of the clustering method. In a real-data example, the groups of clustered data are clustered and the result is evaluated with the silhouette coefficient. The evaluation indicates that the machine-learning K-means algorithm clusters the subgroups of clustered data well.
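The pipeline summarized in the abstract (factor model, principal component method, K-means on subgroups, silhouette evaluation) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the one-factor model, the loading values in `type_a` and `type_b`, the subgroup sizes, the use of estimated first-factor loadings as the feature vector of each subgroup, and every function name are hypothetical choices made for the example. It uses only the Python standard library.

```python
import math
import random

def simulate_subgroup(loadings, n, rng, noise_sd=0.45):
    """Draw n observations from a one-factor model x = loadings * f + e (assumed form)."""
    rows = []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # common factor score
        rows.append([l * f + rng.gauss(0.0, noise_sd) for l in loadings])
    return rows

def covariance(rows):
    """Sample covariance matrix of a list of observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def pc_loadings(cov, iters=200):
    """First-factor loadings by the principal component method:
    sqrt(leading eigenvalue) * leading eigenvector, found by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * cov[i][j] * v[j] for i in range(p) for j in range(p))
    sign = 1.0 if sum(v) >= 0 else -1.0  # fix the sign so subgroups are comparable
    return [sign * math.sqrt(eigval) * x for x in v]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50):
    """Lloyd's K-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda q: min(dist(q, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(q, centers[c])) for q in points]
        for c in range(k):
            members = [q for q, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    scores = []
    for i, q in enumerate(points):
        by_cluster = {}
        for j, r in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(q, r))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())   # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

rng = random.Random(0)
type_a = [0.9, 0.9, 0.1]  # hypothetical loading pattern for one kind of subgroup
type_b = [0.1, 0.9, 0.9]  # hypothetical loading pattern for the other kind
true_types = [0] * 5 + [1] * 5
subgroups = [simulate_subgroup(type_a if t == 0 else type_b, 300, rng) for t in true_types]

# One feature vector per subgroup: its estimated first-factor loadings.
features = [pc_loadings(covariance(g)) for g in subgroups]
labels = kmeans(features, k=2)
score = silhouette(features, labels)
print(labels, round(score, 3))
```

Each subgroup is reduced to a single feature vector before clustering, so K-means groups whole subgroups rather than individual observations. A silhouette coefficient close to 1 would suggest that the recovered clusters of subgroups are well separated.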