Abstract
The core idea of the "density-distance" fast search clustering algorithm is that cluster centers have a higher local density than their neighbors and lie at a relatively large distance from any point of higher density. To enrich the methodological toolkit of bibliometrics, this paper introduces the algorithm into co-word clustering analysis. Taking "subject service" as the research topic, we build the co-word matrix with Bicomb, convert it into a triple (three-column) similarity-distance table in Matlab, and then apply the algorithm, which automatically partitions the subject-service literature into five research clusters, identifies the center of each cluster, and supports visualization of the clustering result. Compared with the clustering algorithms embedded in existing tools such as SPSS, Ucinet, and Citespace, the main advantages of the proposed method are that it requires no repeated iteration and therefore takes less time, that it determines the number of clusters and the names of the cluster centers automatically, and that the clustering results are satisfactory, easy to interpret, and well visualized.
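The pipeline summarized in the abstract (co-word matrix, then a triple distance table, then density and distance per keyword, then cluster centers) can be illustrated with a minimal Python sketch. This is not the authors' Bicomb/Matlab implementation. The Ochiai similarity, the Gaussian density kernel, the cutoff `dc`, and all function names are assumptions made for illustration. The number of clusters `k` is passed in by hand here, whereas the paper reports that it is determined automatically.

```python
from math import exp, sqrt


def cooccurrence_to_triples(cooc):
    """Turn a symmetric co-word matrix into (i, j, distance) triples.

    Assumption: similarity is the Ochiai coefficient
    c_ij / sqrt(c_ii * c_jj), with positive keyword frequencies on the
    diagonal, and distance = 1 - similarity.
    """
    n = len(cooc)
    triples = []
    for i in range(n):
        for j in range(i + 1, n):
            sim = cooc[i][j] / sqrt(cooc[i][i] * cooc[j][j])
            triples.append((i, j, 1.0 - sim))
    return triples


def density_distance(triples, n, dc):
    """Compute local density rho and separation delta for each keyword."""
    dist = [[0.0] * n for _ in range(n)]
    for i, j, d in triples:
        dist[i][j] = dist[j][i] = d

    # Assumed Gaussian kernel: closer neighbours contribute more density.
    rho = [
        sum(exp(-((dist[i][j] / dc) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Visit points from highest to lowest density (stable for ties).
    order = sorted(range(n), key=lambda i: -rho[i])
    global_max = max(max(row) for row in dist)
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; by convention it
            # receives the largest distance in the data set.
            delta[i] = global_max
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher, order


def assign_clusters(rho, delta, nearest_higher, order, k):
    """Pick the k points with the largest rho * delta as centers, then give
    every other point the label of its nearest denser neighbour."""
    gamma = [r * d for r, d in zip(rho, delta)]
    centers = sorted(order, key=lambda i: -gamma[i])[:k]
    labels = [-1] * len(rho)
    for label, c in enumerate(centers):
        labels[c] = label
    for i in order:  # decreasing density, so the neighbour is labelled first
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return centers, labels
```

Each keyword is labelled in a single pass in order of decreasing density, which is why this family of methods needs no iterative refinement. In practice the centers are usually read off a plot of delta against rho rather than fixed through `k`.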
Source
《情报学报》
CSSCI
Peking University Core Journals (北大核心)
2016, No. 4, pp. 380-388 (9 pages)
Journal of the China Society for Scientific and Technical Information
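Funding
An interim result of the National Social Science Fund of China project "Research on the Innovation and Development of Library Service Systems in the Big Data Era"
Project No.: 15BTQ023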
Keywords
density-distance algorithm
fast search
clustering analysis
co-word clustering