The Improvement of K-Means Clustering Algorithm based on Internet Public Opinion
Cited by: 3
Abstract The traditional K-Means clustering algorithm is only guaranteed to converge to a local optimum, so its clustering results are highly sensitive to the choice of initial representative points. Agglomerative hierarchical clustering needs no initial cluster centers and can automatically generate clustering models of a text set at different levels, but its computational complexity is higher and its merging process is irreversible. Taking the characteristics of internet public opinion into account, this paper analyzes in depth the strengths and weaknesses of the K-Means and agglomerative hierarchical clustering algorithms and proposes an improved K-Means algorithm. The core idea of the improved algorithm is to combine the respective strengths of the two algorithms in initial point selection and in the clustering process, and to integrate and optimize them. Experimental analysis and practical application show that the improved text clustering algorithm can substantially improve the accuracy and validity of clustering results for internet public opinion information, as well as the efficiency of the algorithm.
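The abstract does not give the details of the improved algorithm, so the following is only a minimal sketch of one common way to combine the two methods, not the authors' implementation. It assumes that agglomerative clustering (here with centroid linkage) is run on a small random sample to produce the initial centers, and that standard K-Means then refines them on the full data set. The function names, the `sample_size` parameter, the assumption that `k` does not exceed the sample size, and the use of Euclidean distance on toy 2-D points are all illustrative assumptions; real public-opinion text would more likely be represented as term-weight vectors.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def agglomerative_centroids(points, k):
    """Merge the two closest clusters (centroid linkage) until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merges are never undone
        del clusters[j]
    return [mean(c) for c in clusters]


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))


def kmeans(points, centroids, max_iter=100):
    """Standard K-Means refinement starting from the given centroids."""
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        new = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if new == centroids:
            break
        centroids = new
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def hybrid_cluster(points, k, sample_size=20, seed=0):
    """Seed K-Means with centroids from agglomerative clustering of a sample."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    seeds = agglomerative_centroids(sample, k)
    return kmeans(points, seeds)


# Toy usage: two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
toy_centroids, toy_labels = hybrid_cluster(toy_points, k=2)
print(toy_centroids, toy_labels)
```

Running the hierarchical step on a sample only keeps its quadratic-or-worse cost manageable, while the K-Means pass lets points move between clusters, which the irreversible merging step cannot do.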
Source Computer Development & Applications (《电脑开发与应用》), 2010, No. 8, pp. 4-6, 15 (4 pages)
Funding Project funded by the Shanxi Provincial Department of Personnel (SX20090108-07)
Keywords internet public opinion; text clustering; K-Means algorithm; agglomerative hierarchical clustering; clustering process