
Research on K-Means algorithm based on concept lattice (基于概念格的K-Means算法研究)
Abstract: The existing K-Means algorithm requires the value of K to be assigned manually, selects its initial centers at random, and represents texts in a high-dimensional form that lacks semantic information. To address these defects, a K-Means algorithm based on concept lattice, called K-MeansBCC, is proposed. The text collection is first preprocessed and converted into a formal context, from which a concept lattice is generated. Texts are then represented by the concepts in the lattice, and the value of K and the initial centers are determined from the weights of the concepts in the texts. Finally, a formula for the concept similarity between texts is designed, and the K-Means algorithm produces the clustering result. Experimental results show that the algorithm improves the efficiency and accuracy of clustering.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal (北大核心), 2011, No. 2, pp. 656-658, 662 (4 pages)
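Funding: National Natural Science Foundation of China (60972090); Natural Science Foundation of Liaoning Province (20072142); Dalian Municipal Government IT Outstanding Teacher Fund (大信发2008-40-6)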
Keywords: K-Means algorithm; concept lattice; clustering; concept similarity; initial center point
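
The pipeline outlined in the abstract (formal context, concept lattice, concept weights, K and initial centers, concept similarity) can be illustrated with a rough Python sketch. The abstract does not give the paper's weight or similarity formulas, so everything specific here is an assumption made for illustration: the toy `context`, the weight `|extent| * |intent|`, the disjoint-extent seed heuristic in `pick_seeds`, and Jaccard similarity as the stand-in measure. The sketch also performs a single assignment step rather than the full iterative K-Means loop, and it is not the K-MeansBCC implementation.

```python
from itertools import combinations

# Hypothetical toy formal context: each document maps to its set of terms.
context = {
    "d1": {"cluster", "kmeans"},
    "d2": {"cluster", "kmeans", "center"},
    "d3": {"lattice", "concept"},
    "d4": {"lattice", "concept", "context"},
}

def intent(docs, ctx):
    """Terms shared by every document in docs (all terms if docs is empty)."""
    if not docs:
        return frozenset().union(*ctx.values())
    return frozenset.intersection(*(frozenset(ctx[d]) for d in docs))

def extent(terms, ctx):
    """Documents whose term sets contain every term in terms."""
    return frozenset(d for d, ts in ctx.items() if terms <= ts)

def formal_concepts(ctx):
    """Naive enumeration: close every subset of documents into (extent, intent)."""
    found = set()
    docs = sorted(ctx)
    for r in range(len(docs) + 1):
        for subset in combinations(docs, r):
            b = intent(subset, ctx)
            found.add((extent(b, ctx), b))
    return found

def weight(concept):
    """Assumed weight: |extent| * |intent| (not the paper's formula)."""
    ext, inten = concept
    return len(ext) * len(inten)

def pick_seeds(concepts):
    """Assumed heuristic: take the heaviest concepts with pairwise disjoint extents."""
    ranked = sorted((c for c in concepts if weight(c) > 0),
                    key=lambda c: (-weight(c), sorted(c[0])))
    seeds, covered = [], set()
    for ext, _ in ranked:
        if not ext & covered:
            seeds.append(ext)
            covered |= ext
    return seeds

def concept_profile(doc, concepts):
    """Represent a document by the intents of the non-trivial concepts containing it."""
    return {inten for ext, inten in concepts
            if doc in ext and weight((ext, inten)) > 0}

def similarity(a, b):
    """Jaccard similarity between two concept profiles (a stand-in measure)."""
    return len(a & b) / len(a | b) if a | b else 0.0

concepts = formal_concepts(context)
seeds = pick_seeds(concepts)
k = len(seeds)  # K follows from the number of seed concepts
profiles = {d: concept_profile(d, concepts) for d in context}

# One assignment step: each document joins the seed group
# it is most similar to on average.
labels = {}
for d in sorted(context):
    scores = [sum(similarity(profiles[d], profiles[s]) for s in seed) / len(seed)
              for seed in seeds]
    labels[d] = scores.index(max(scores))

print(k, labels)
```

On this toy context the sketch finds six formal concepts, derives two seed groups, and assigns `d1` and `d2` to one cluster and `d3` and `d4` to the other. The subset enumeration is exponential in the number of documents, so a real corpus would need a proper lattice construction algorithm.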

