Abstract
On social networks, the tags that describe resources are often inaccurate and poorly structured. To address these problems, a new agglomerative hierarchical clustering algorithm for tags, based on weight and co-occurrence, is proposed. First, data on web page tags are collected from a social network. Next, the weight of each tag with respect to each web page is calculated, followed by the co-occurrence similarity between tags; these values serve as the initial data for agglomerative hierarchical clustering. Finally, the clustering results are compared with a manual classification to compute precision, recall, and their weighted harmonic mean, F1. The experimental results demonstrate the feasibility of the algorithm.
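The abstract names the steps of the pipeline but gives no formulas, so the following Python sketch is only one plausible reading, not the authors' method. Every definition in it is an assumption made for illustration: a tag's weight on a page is taken as its share of that page's tag counts, co-occurrence similarity is a weighted Jaccard score over pages, clusters are merged by average linkage until a hypothetical `threshold` is no longer reached, and precision, recall, and F1 are computed over pairs of tags placed in the same cluster.

```python
from itertools import combinations


def tag_weights(page_tags):
    """Map {page: {tag: count}} to {tag: {page: weight}}.

    Assumed weight: the tag's share of all tag counts on that page.
    """
    weights = {}
    for page, counts in page_tags.items():
        total = sum(counts.values())
        for tag, n in counts.items():
            weights.setdefault(tag, {})[page] = n / total
    return weights


def cooccurrence_similarity(wa, wb):
    """Assumed similarity: weighted Jaccard over the pages two tags appear on."""
    pages = set(wa) | set(wb)
    num = sum(min(wa.get(p, 0.0), wb.get(p, 0.0)) for p in pages)
    den = sum(max(wa.get(p, 0.0), wb.get(p, 0.0)) for p in pages)
    return num / den if den else 0.0


def agglomerative(weights, threshold):
    """Merge the two most similar clusters (average linkage) until the best
    similarity falls below the threshold."""
    clusters = [frozenset([t]) for t in sorted(weights)]

    def link(c1, c2):
        sims = [cooccurrence_similarity(weights[a], weights[b])
                for a in c1 for b in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]))
        if link(clusters[i], clusters[j]) < threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def pairwise_prf(predicted, reference):
    """Precision, recall, and F1 over pairs of tags that share a cluster."""
    def pairs(clusters):
        return {frozenset(p) for c in clusters
                for p in combinations(sorted(c), 2)}

    pred, ref = pairs(predicted), pairs(reference)
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (invented for illustration): how often each tag was applied to a page.
page_tags = {
    "p1": {"python": 3, "programming": 1},
    "p2": {"python": 1, "programming": 1},
    "p3": {"recipe": 2, "cooking": 2},
    "p4": {"recipe": 3, "cooking": 1},
}
manual = [{"python", "programming"}, {"recipe", "cooking"}]

weights = tag_weights(page_tags)
clusters = agglomerative(weights, threshold=0.3)
print(clusters)
print(pairwise_prf(clusters, manual))
```

The stopping threshold plays the role of cutting the dendrogram at a fixed height; a real evaluation would need the paper's actual weight and similarity definitions and its manually classified tag set.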
Source
《工业控制计算机》
2014, No. 6, pp. 116-117, 120 (3 pages)
Industrial Control Computer
Funding
Natural Science Foundation of Guangdong Province (Grant No. S2011010003681)
Keywords
tag
clustering
social network (SNS)
weight
co-occurrence