Abstract
After outlining the meaning, role, and general process of document clustering, this paper analyzes the basic idea of an improved K-means clustering method that selects its initial centroids by a "min-max" principle, designs the corresponding clustering algorithm, and implements a clustering system. Clustering experiments are run on that system with 300 academic documents and their associated feature terms. The results indicate that the improved K-means algorithm designed and implemented in the paper performs well.
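The abstract names the technique but does not spell out the procedure, so the following is only a minimal sketch of one common reading of "min-max" initial centroid selection: start from one point, then repeatedly add the point whose distance to its nearest already-chosen centroid is largest, and hand the result to ordinary K-means. The function names (`maxmin_initial_centroids`, `kmeans`), the `first_index` parameter, and the use of Euclidean distance on plain numeric tuples are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def maxmin_initial_centroids(points, k, first_index=0):
    """Pick k initial centroids by a max-min (farthest-first) rule.

    Assumption: the first centroid is simply points[first_index].
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [first_index]
    # min_dist[i] is the distance from point i to its nearest chosen centroid.
    min_dist = [euclidean(p, points[first_index]) for p in points]
    while len(chosen) < k:
        # Take the point that is farthest from its nearest chosen centroid.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], euclidean(p, points[nxt]))
    return [list(points[i]) for i in chosen]


def kmeans(points, k, max_iter=100):
    """Standard K-means iterations seeded with the max-min centroids."""
    centroids = maxmin_initial_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(toy, 2))
```

For real documents, the points would typically be term-weight vectors built from the feature terms, and a cosine-based distance may be a better fit than the Euclidean distance used here; both choices are left open by the abstract.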
Source
《现代图书情报技术》
CSSCI
Peking University Core Journals (北大核心)
2008, No. 12, pp. 73-79 (7 pages)
New Technology of Library and Information Service