TGFCM: A Novel Approach of Chinese Text Mining Based on Fuzzy Clustering
Abstract A new dynamic fuzzy clustering method is proposed. Traditional fuzzy clustering requires the number of clusters to be fixed in advance; to remove this requirement, a dynamic self-organizing map neural network is used to determine the number of clusters. Text feature vectors are built with the vector space model (VSM) and TF-IDF weighting. The cluster number obtained from the dynamic self-organizing map is then passed to the fuzzy C-means (FCM) algorithm, which produces the final clustering. Compared with running the dynamic self-organizing map alone, the proposed algorithm yields higher clustering precision, and fuzzy clustering is better suited to the semantic diversity of text and the fuzziness of text membership. Experiments confirm the effectiveness of the algorithm.
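Source Computer Engineering (《计算机工程》), 2006, No. 5, pp. 7-9 (3 pages); indexed in EI, CAS, CSCD, and the Peking University Core Journals list
Funding National Natural Science Foundation of China (Grant No. 60275020)
Keywords self-organizing maps; text feature vector; fuzzy clustering; number of clusters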
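The pipeline summarized in the abstract (TF-IDF feature vectors, then fuzzy C-means with a given cluster count) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `tfidf` and `fuzzy_c_means`, the smoothed IDF formula, the fuzzifier `m=2.0`, and the toy token lists are all assumptions. The dynamic self-organizing map step is not reproduced here; the cluster count `c` is simply passed in as if it had already been estimated, and Chinese text is assumed to have been word-segmented beforehand.

```python
import numpy as np

def tfidf(docs):
    """Build L2-normalised TF-IDF row vectors from tokenised documents."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: j for j, t in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for t in d:
            tf[i, index[t]] += 1.0
    df = (tf > 0).sum(axis=0)                # document frequency per term
    idf = np.log(len(docs) / df) + 1.0       # assumed smoothing: +1 keeps ubiquitous terms
    x = tf * idf
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.where(norms == 0, 1.0, norms), vocab

def fuzzy_c_means(x, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM: alternate centre and membership updates until stable."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)        # each row of memberships sums to 1
    for _ in range(iters):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)       # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return u, centers

# Hypothetical toy corpus; real input would be segmented Chinese documents.
docs = [
    ["fuzzy", "clustering", "text"],
    ["text", "mining", "clustering"],
    ["neural", "network", "map"],
    ["self", "organizing", "map", "network"],
]
x, vocab = tfidf(docs)
u, centers = fuzzy_c_means(x, c=2)           # c would come from the dynamic SOM step
print(np.round(u, 3))                        # soft membership of each document in each cluster
```

Each row of `u` gives the degrees to which one document belongs to every cluster, which is how fuzzy clustering expresses the ambiguous membership of a text that a hard assignment would hide.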