Merging versus not merging: a comparison of two similarity clustering analysis methods (cited by: 17)

Comparison of merged and non-merged similarity clustering analysis methods
Abstract Three data sets were used as samples: the distribution of 4638 insect species across 7 geographic regions of Shanxi Province (small sample), of 7766 insect species across 14 geographic regions of Inner Mongolia (medium sample), and of 16804 insect genera across 67 ecological regions of China (large sample). Each data set was analysed separately with the traditional similarity clustering analysis (SCA), which merges regions level by level, and with the new multivariate similarity clustering analysis (MSCA), which requires no merging. The comparison shows that the non-merged method yields results consistent with statistical logic as well as with geographic and biological logic, regardless of the scale of the data. The merged method can give results similar to those of the non-merged method when few regions are involved, but as the number of regions grows its clustering structure changes, to the point where it loses its clustering function altogether. However large or small the difference between the two results, their nature is entirely different. In the non-merged method the similarity coefficients are inherent, mutually independent, and exist simultaneously; they can be calculated in any order, and the clustering result describes the state of closeness and distance among all the regions involved. In the merged method every similarity coefficient is either the basis for a merge or the result of one: each coefficient is the condition under which the next is produced, and each new coefficient results from the disappearance of the one before it, in strict sequence. By the time the last coefficient is produced, all earlier coefficients and all the original regions have ceased to exist, so the clustering result merely records a process of continual merging and disappearance. While acknowledging the historical value of the merged method, the authors conclude that the multivariate similarity coefficient formula and the MSCA method created by Shen Xiaocheng (申效诚) et al. discard merging and order reduction, the root source of bias and error. In particular, MSCA avoids two failures of merging: losing branches of the clustering result that matter for the relationships among regions, and being unable to identify similarity levels that need to be shown in detail. It can therefore produce relatively objective clustering results that fit ecological reality, and it can readily be applied to macroscopic clustering analysis of ecosystem data, which the authors state had not been fully accomplished before. They regard it as an effective clustering tool for biogeography and expect it to move quantitative biogeographic research into a new stage.
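The contrast drawn in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract gives neither the SCA merging rule nor the MSCA coefficient formula, so the sketch assumes Jaccard similarity with pooled taxa for the merged procedure and uses a hypothetical group coefficient (the mean share of a group's pooled taxa that each region has in common with at least one other member) for the non-merged one. The region names and taxon sets are invented toy data.

```python
from itertools import combinations

def jaccard(a, b):
    """Pairwise Jaccard similarity between two taxon sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def merged_clustering(regions):
    """Assumed stepwise merging: join the most similar pair, pool taxa, repeat.

    Returns the merge log as (name_a, name_b, coefficient) tuples.
    """
    pool = dict(regions)  # work on a copy; the caller's data stays intact
    log = []
    while len(pool) > 1:
        a, b = max(combinations(sorted(pool), 2),
                   key=lambda p: jaccard(pool[p[0]], pool[p[1]]))
        log.append((a, b, jaccard(pool[a], pool[b])))
        # Both regions cease to exist; only the pooled region remains.
        pool[a + "+" + b] = pool.pop(a) | pool.pop(b)
    return log

def group_similarity(regions, names):
    """Hypothetical group coefficient computed on the original regions.

    For each member, count the taxa it shares with at least one other
    member; divide the sum by (number of members * pooled taxon count).
    """
    sets = [regions[n] for n in names]
    total = set().union(*sets)
    shared = 0
    for i, s in enumerate(sets):
        others = set().union(*(t for j, t in enumerate(sets) if j != i))
        shared += len(s & others)
    return shared / (len(sets) * len(total)) if total else 0.0

# Toy data (invented): four regions with integer taxon identifiers.
regions = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 5, 6, 7}, "D": {8, 9}}

merge_log = merged_clustering(regions)
abc = group_similarity(regions, ["A", "B", "C"])
ab = group_similarity(regions, ["A", "B"])
```

In the merged procedure, the second coefficient compares the pooled region `A+B` with `C`, so it exists only because the first merge happened. The group coefficient is computed directly from the original regions, for any subset and in any order, and for two regions it reduces to the Jaccard value.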
Source Acta Ecologica Sinica (《生态学报》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心), 2013, No. 11, pp. 3480-3487 (8 pages)
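Funding Henan Province Basic and Frontier Technology Research Program (Study of the insect fauna of Henan Province, 082300430370); Henan Province Key Laboratory Construction Special Project (Study of the geographic distribution and regionalization of insects in Henan, 112300413221)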
Keywords multivariate similarity clustering analysis; multivariate similarity coefficients; biogeography