摘要
pSCAN算法的聚类结果受密度约束参数和相似度阈值参数的影响,如果用户提供的聚类参数得到的聚类结果无法满足需求,那么用户可以通过实例簇表达自己的聚类需求。针对实例簇表达聚类查询需求的问题,提出一种实例簇驱动的图结构聚类参数计算算法PART及其改进算法ImPART。首先,分析两个聚类参数对聚类结果的影响,并提取实例簇的相关子图;其次,对相关子图进行分析得到密度约束参数的可行区间,并根据当前密度约束参数和节点之间的结构相似度将实例簇内节点划分为核心节点和非核心节点;最后,依据节点划分结果计算出当前密度约束参数对应的最优相似度阈值参数,并在相关子图上对得到的参数进行验证和优化,直到得到满足实例簇需求的聚类参数。在真实数据集上的实验结果表明,所提算法能够为用户实例簇返回一组有效参数,且所提改进算法ImPART的运行时间比PART缩短了20%以上,能够快速有效地为用户返回满足实例簇要求的最优聚类参数。
Clustering results of the pSCAN(pruned Structural Clustering Algorithm for Network)algorithm are influenced by the density constraint parameter and the similarity threshold parameter.If the requirements cannot be satisfied by the clustering results obtained by the clustering parameters provided by the user,then the user’s own clustering requirements can be expressed through instance clusters.Aiming at the problem of instance clusters expressing clustering query requirements,an instance cluster-driven structural graph clustering parameter calculation algorithm PART and its improved algorithm ImPART were proposed.Firstly,the influences of two clustering parameters on the clustering results were analyzed,and correlation subgraph of instance cluster was extracted.Secondly,the feasible interval of the density constraint parameter was obtained by analyzing the correlation subgraph,and the nodes in the instance cluster were divided into core nodes and non-core nodes according to the current density constraint parameter and the structural similarity between nodes.Finally,according to the node division result,the optimal similarity threshold parameter corresponding to the current density constraint parameter was calculated,and the obtained parameters were verified and optimized on the relevant subgraph until the clustering parameters that satisfy the requirements of the instance cluster were obtained.Experimental results on real datasets show that a set of effective parameters can be returned for user instance clusters by using the proposed algorithm,and the proposed improved algorithm ImPART is more than 20% faster than the basic algorithm PART,and can return the optimal clustering parameters that satisfy the requirements of instance clusters quickly and effectively for the user.
作者
宗传玉
宪超
夏秀峰
ZONG Chuanyu;XIAN Chao;XIA Xiufeng(School of Computer Science,Shenyang Aerospace University,Shenyang Liaoning 110136,China)
出处
《计算机应用》
CSCD
北大核心
2023年第2期398-406,共9页
journal of Computer Applications
基金
国家自然科学基金资助项目(61802268)。
关键词
图结构聚类
实例簇
参数计算
节点划分
最优参数
structural graph clustering
instance cluster
parameter calculation
node division
optimal parameter