Abstract
This paper proposes and designs a text clustering algorithm for high-dimensional sparse similarity matrices. The algorithm combines ideas from hierarchical clustering and partitional clustering, using a threshold to control both which clustering strategy is applied and when a new cluster is created. In an experiment on a small sample, the algorithm achieved higher recall and accuracy than several classical methods.
Source
Journal of Shaanxi University of Science & Technology (Natural Science Edition) (《陕西科技大学学报(自然科学版)》)
2008, No. 6, pp. 163-166 (4 pages)
Keywords
text clustering
clustering algorithm
Chinese word segmentation