Abstract
Cluster analysis is a modern multivariate statistical method built on the idea that "things of a kind come together." The field has developed rapidly and has been applied productively in economics, management, geological exploration, weather forecasting, biological taxonomy, archaeology, medicine, psychology, and the drafting of national and regional standards. This paper first reviews the fundamentals of cluster analysis. It then applies several hierarchical clustering methods to a concrete data set in SPSS, determines a suitable number of clusters in three ways (fixing a threshold T, inspecting scatter plots, and using statistics), and compares the methods. The conclusions are as follows: fixing a threshold T is rather subjective; inspecting scatter plots is more intuitive and may be more efficient than formal clustering methods; using statistics often gives a more definite answer. As for clustering quality, the average linkage method and Ward's method (sum of squared deviations) perform relatively well.
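The comparison described in the abstract can be illustrated with a minimal sketch. The paper itself uses SPSS; the pure-Python code below is not the author's procedure. The function names, the six sample points, and the threshold value are all hypothetical, chosen only to show how average linkage and Ward's method merge clusters and how a threshold T stops the merging.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def average_linkage(a, b):
    """Mean pairwise Euclidean distance between clusters a and b."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def ward_linkage(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    weight = len(a) * len(b) / (len(a) + len(b))
    return weight * dist(centroid(a), centroid(b)) ** 2


def agglomerate(points, linkage, threshold):
    """Merge the closest pair of clusters until the smallest linkage
    value exceeds the threshold T (assumed stopping rule)."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

for name, linkage in [("average", average_linkage), ("ward", ward_linkage)]:
    result = agglomerate(points, linkage, threshold=3.0)
    print(name, len(result), "clusters")
```

The number of clusters depends directly on the chosen T, which illustrates why the abstract calls the threshold approach subjective. Note also that the two linkage functions are on different scales (a distance versus a squared-distance increment), so the same T does not mean the same thing for both.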
Author
陈婷婷
Tingting Chen (School of Science, China University of Geosciences (Beijing), Beijing 100083, China)
Keywords
Cluster Analysis
Classification
Hierarchical Clustering Method
SPSS