期刊文献+

一种基于随机游动的聚类算法 被引量:6

A Random Walk Based Clustering Algorithm
下载PDF
导出
摘要 该文提出一种改进的随机游动模型,并在此模型的基础上,发展了一种数据聚类算法。在此算法中,数据集中的样本点根据改进的随机游动模型,生成有权无向图G(V,E,d),其中每个样本点对应图G的一个顶点,并且假设每个顶点为可以在空间中移动的Agent。随后计算每个顶点向其邻集中顶点转移的概率,在随机选定邻集中的一个顶点作为转移方向后,移动一个单位距离。在所有样本点不断随机游动的过程中,同类的样本点就会逐渐的聚集到一起,而不同类的样本点相互远离,最后使得聚类自动形成。实验结果表明,基于随机游动的聚类算法能使样本点合理有效地被聚类,同时,与其他算法对比也说明了此算法的有效性。 In this paper, a modified model of random walk is proposed, and then a clustering algorithm is developed based on this model. In the algorithm, at first a weighted and undirected graph G(V,E,d) is constructed among data points in a dataset according to the model, where each data point corresponds to a vertex in the graph, and is regarded as an agent who can move randomly in space. Next, the transition probabilities of data points are computed, and then each data point chooses a neighbor randomly in its neighborhood as a transition direction and takes a step to it. As all data points walk in space at random repeatedly, the data points that belong to the same class are located at a same position, whereas those that belong to different classes are away from one another. Consequently, the experimental results demonstrate that data points in datasets are clustered reasonably and efficiently. Moreover, the comparison with other algorithms also provides an indication of the effectiveness of the algorithm.
出处 《电子与信息学报》 EI CSCD 北大核心 2009年第3期523-526,共4页 Journal of Electronics & Information Technology
基金 国家自然科学基金(60405012 60675055)资助课题
关键词 数据聚类 随机游动 无监督学习 Data clustering Random walk Unsupervised learning
  • 相关文献

参考文献10

  • 1Merwe V D and Engelbrecht A P. Data clustering using particle swarm optimization. Proceedings of IEEE Congress on Evolutionary Computation 2003 (CEC2003), Canbella, Australia, 2003: 215-220.
  • 2Labroche N, Monmarche N, and Venturini G. AntClust: Ant Clustering and Web Usage Mining. Genetic and Evolutionary Computation Conference, Chicago, IL, USA, 2003: 25-36.
  • 3Cui X H, Gao J Z, and Potok T E. A flocking based algorithm for document clustering analysis. Journal of Systems Architecture, 2006, 52(8): 505-515.
  • 4Yen L, Vanvyve D, and Wouters F, et al.. Clustering using a random walk based distance measure. Proceedings of the 13th Symposium on Artificial Neural Networks, Bruges (Belgium), 2005: 317-324.
  • 5Harel D and Koren Y. Clustering spatial data using random walks. Proceedings of the Seventh ACM SIGKDD Conference, NY, USA, ACM Press, 2001: 281-286.
  • 6Erkan G. Language Model-Based Document Clustering Using Random WaLks. Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, June 2006: 479-486.
  • 7Lovasz L. Random walks on graphs: A survey. Bolyai Society Mathematical Study, Combinatorics, Paul ErdOs is eighty, Keszthely, Hungary, 1993: 1-46.
  • 8Blake C L and Merz C J. UCI Repository of machine learning databases, 1998.
  • 9Song L, Smola A, and Gretton A, et al.. A dependence maximization view of clustering. Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, 2007: 815-822.
  • 10Ding C and Li T. Adaptive dimension reduction using discriminant analysis and K-means clustering, Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, 2007: 521-528.

同被引文献102

引证文献6

二级引证文献161

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部