
A Novel Text Clustering Method Based on Ontology

Cited by: 12
Abstract: This ontology-based text clustering method introduces WordNet into text representation and defines a key concept set. It uses the concept nodes in WordNet and the semantic relations between those concepts to reduce the dimensionality of the text feature vector and improve clustering quality. During clustering, the algorithm computes text similarity from each text's key concept set and concept feature vector, and it labels clusters with key concept sets so that every cluster in the result has an explanation. Experimental results show that the method effectively reduces the dimensionality of the text feature vector, improves clustering quality compared with other text clustering algorithms, and makes the clustering results more interpretable.
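The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives neither the definition of the key concept set nor the similarity formula, so everything here is an assumption. `TOY_CONCEPT_OF` is a hypothetical hand-written stand-in for WordNet lookups, `key_concepts` assumes the key concept set is simply the most frequent concepts of a text, and `similarity` assumes a weighted blend (hypothetical parameter `alpha`) of key-concept overlap and cosine similarity over concept vectors.

```python
import math
from collections import Counter

# Hypothetical stand-in for WordNet: maps a word to a more general concept.
# Real WordNet lookups would replace this table.
TOY_CONCEPT_OF = {
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
    "dog": "animal", "cat": "animal", "horse": "animal",
}

def concept_vector(tokens):
    """Count concepts instead of words; words without a concept are dropped."""
    return Counter(TOY_CONCEPT_OF[t] for t in tokens if t in TOY_CONCEPT_OF)

def key_concepts(vector, top_k=2):
    """Assumed definition: the top_k most frequent concepts of a text."""
    return {concept for concept, _ in vector.most_common(top_k)}

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[c] * v[c] for c in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jaccard(a, b):
    """Overlap of two sets; defined as 0.0 when both are empty."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def similarity(tokens_a, tokens_b, alpha=0.5):
    """Assumed blend of key-concept overlap and concept-vector cosine."""
    va, vb = concept_vector(tokens_a), concept_vector(tokens_b)
    overlap = jaccard(key_concepts(va), key_concepts(vb))
    return alpha * overlap + (1 - alpha) * cosine(va, vb)

doc_a = ["car", "truck", "dog"]
doc_b = ["bus", "cat"]
doc_c = ["dog", "horse"]
print(concept_vector(doc_a))   # six distinct words collapse to two concepts
print(similarity(doc_a, doc_b), similarity(doc_a, doc_c))
```

In this toy setting, six distinct words map to two concept dimensions, which is the dimensionality reduction the abstract refers to. The key concepts of the texts in a cluster could also serve as a human-readable label for that cluster.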
Source: Journal of Jilin University (Science Edition); indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 2, pp. 277-283 (7 pages)
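Funding: National Natural Science Foundation of China (Grant Nos. 60973040, 60903098); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (Grant No. 200801830021); Natural Science Foundation of Jilin Province (Grant No. 20070533); Jilin University Basic Research Fund for Interdisciplinary and Innovation Projects (Grant No. 200810025)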
Keywords: ontology; WordNet; key concept set; concept feature vector
