Research on a Text Clustering Algorithm Based on Artificial Immune Network and SOM (cited by: 3)
Abstract: The core problem of text clustering is to find an optimised algorithm for clustering text vectors, which is a typical high-dimensional clustering task. This paper proposes TCBSA, a two-stage text clustering algorithm based on the self-organising map (SOM) neural network and the artificial immune network aiNet. The algorithm first clusters the documents with a SOM, mapping the high-dimensional text data onto a two-dimensional plane, and then clusters the mapped data with aiNet. It thereby exploits the SOM's strength in reducing the dimensionality of high-dimensional data and compensates for aiNet's weak clustering ability on such data. Simulation experiments indicate that the algorithm is feasible, shows some adaptive ability, and achieves good clustering results.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2010, No. 5, pp. 118-120, 124 (4 pages)
Funding: Research Project of the Education Department of Henan Province (2008A510007)
Keywords: text clustering; similarity; vector space model; artificial immune network; self-organising neural network
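
The two-stage idea in the abstract (SOM projection to a 2-D plane, then immune-network clustering of the projected points) can be illustrated with a minimal sketch. This is not the TCBSA implementation: the abstract gives no parameters or update rules, so the grid size, learning schedule, suppression threshold `sigma_s`, and all function names below are assumptions made for illustration. The second stage is a heavily simplified aiNet-style loop (clone, mutate toward the antigen, suppress similar cells), and assigning each document to its nearest memory cell is a simplification as well.

```python
import numpy as np

def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every unit, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighbourhood radius
            dists = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best matching unit
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))              # Gaussian neighbourhood
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def project(data, weights):
    """Stage 1 output: map each vector to the 2-D grid position of its BMU."""
    flat = weights.reshape(-1, weights.shape[2])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    idx = np.argmin(d, axis=1)
    return np.column_stack(np.unravel_index(idx, weights.shape[:2])).astype(float)

def ainet_memory(points, n_gen=10, n_clones=5, sigma_s=1.5, seed=0):
    """Simplified aiNet-style loop (an assumption, not the published aiNet)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=min(10, len(points)), replace=False)
    memory = points[start].copy()
    for _ in range(n_gen):
        candidates = [memory]
        for ag in points:                            # each projected document is an antigen
            best = memory[np.argmin(np.linalg.norm(memory - ag, axis=1))]
            alpha = rng.random((n_clones, 1))
            clones = best + alpha * (ag - best)      # mutate clones toward the antigen
            closest = np.argmin(np.linalg.norm(clones - ag, axis=1))
            candidates.append(clones[[closest]])     # keep the highest-affinity clone
        kept = []
        for c in np.vstack(candidates):              # network suppression
            if all(np.linalg.norm(c - k) >= sigma_s for k in kept):
                kept.append(c)
        memory = np.array(kept)
    return memory

def two_stage_cluster(data, rows=5, cols=5, sigma_s=1.5, seed=0):
    """SOM projection followed by immune-network clustering of the 2-D points."""
    weights = train_som(data, rows, cols, seed=seed)
    coords = project(data, weights)
    memory = ainet_memory(coords, sigma_s=sigma_s, seed=seed)
    d = np.linalg.norm(coords[:, None, :] - memory[None, :, :], axis=2)
    return coords, memory, np.argmin(d, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for document vectors: two groups in 30 dimensions.
    docs = np.vstack([rng.random((20, 30)) * 0.2, rng.random((20, 30)) * 0.2 + 0.8])
    coords, memory, labels = two_stage_cluster(docs)
    print(len(memory), "memory cells; labels:", labels)
```

In this sketch the immune stage only ever sees 2-D coordinates, which is the point of the two-stage design: distances in the projected plane stay meaningful where distances in the original high-dimensional space may not. A fuller aiNet would typically group the memory cells further (for example with a minimum spanning tree) rather than treating each cell as its own cluster.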