Abstract
Building on existing research on microblog classification, this paper proposes a method that classifies microblogs by clustering the users who repost a popular microblog, so that the classification results are of greater practical value in public security work. Two clustering algorithms are used: the well-established K-means algorithm and X-means, an improved variant of it. X-means uses the Bayesian information criterion (BIC), a more principled criterion, to measure similarity between clusters, and the user no longer has to specify the number of clusters; specifying a range for that number is enough. This mechanism improves the accuracy and rigor of the clustering. A comparative analysis of the experimental results shows that the clusters produced by X-means fit the data better than those produced by K-means, so X-means is adopted for user clustering in the microblog classification study. The paper also describes how users cluster under different types of microblogs and proposes response strategies for network security authorities tailored to each type.
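The idea of supplying only a range for the number of clusters and letting BIC decide can be illustrated with a minimal sketch. This is not the paper's implementation and not full X-means, which refines the model by recursively splitting clusters; as a simplifying assumption, the sketch runs plain K-means for every k in the range and keeps the k with the highest BIC under a spherical Gaussian model. All function names (`kmeans`, `bic`, `choose_k`, `demo_points`) and the synthetic data are hypothetical and chosen for illustration only.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=50):
    """Plain K-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old center if a cluster happens to become empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def bic(points, centers, labels):
    """BIC (higher is better), assuming one shared spherical variance."""
    r, m, k = len(points), len(points[0]), len(centers)
    sse = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    var = max(sse / (m * (r - k)), 1e-12)  # per-dimension variance estimate
    counts = [labels.count(j) for j in range(k)]
    loglik = sum(n * math.log(n / r) for n in counts if n > 0)
    loglik -= r * m / 2 * math.log(2 * math.pi * var)
    loglik -= sse / (2 * var)
    n_params = (k - 1) + k * m + 1  # mixing weights, centers, variance
    return loglik - n_params / 2 * math.log(r)


def choose_k(points, k_min, k_max):
    """Return the k in [k_min, k_max] whose K-means fit has the highest BIC."""
    return max(range(k_min, k_max + 1),
               key=lambda k: bic(points, *kmeans(points, k)))


def demo_points(seed=0):
    """Synthetic 2-D data: three well-separated blobs (not the paper's data)."""
    rng = random.Random(seed)
    return [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(30)]
```

Calling `choose_k(demo_points(), 1, 6)` scans the range 1 to 6 without the caller committing to a single cluster count. In the paper's setting the points would be feature vectors of reposting users, which the abstract does not specify.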
Source
Netinfo Security (《信息网络安全》)
2015, No. 7, pp. 84-89 (6 pages)
Funding
Key Research Program Project of the Ministry of Public Security [2011ZDYJGADX016]
Keywords
user clustering
popular microblogs
classification