
Research on Semantic Kernel Function in Text Classification (文本分类中的语义核函数研究)
Cited by: 8
Abstract: Many traditional text classification algorithms classify documents using only statistical information about terms: they consider how often terms occur in the indexed documents and in the whole collection, and ignore the semantic relatedness between terms. For the text classification task, this paper proposes a method for constructing an ontology-based semantic kernel function, designs and implements a WordNet-based semantic kernel algorithm, and embeds the kernel in a Support Vector Machine (SVM) classifier for text classification experiments. Results on the 20 Newsgroups dataset show that the SVM with the semantic kernel performs clearly better than the SVM with a linear kernel.
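Source: Information Science (《情报科学》), indexed in CSSCI and the PKU Core Journals list, 2010, No. 7, pp. 970-975, 979 (7 pages)
Funding: Major Project of the Key Research Bases for Humanities and Social Sciences, Ministry of Education (08JJD870225)
Keywords: text classification; semantic kernel function; ontology; support vector machines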
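The idea summarized in the abstract, replacing the linear kernel with one that accounts for semantic relatedness between terms, can be illustrated with a small sketch. This is not the paper's algorithm: the vocabulary, the term-to-concept weights in `P`, and all function names are hypothetical, and the weights are made-up numbers rather than WordNet-derived scores. The sketch only assumes the common form k(x, y) = xᵀ(PPᵀ)y, which is a valid kernel because PPᵀ is symmetric positive semidefinite.

```python
# Toy vocabulary; every value below is an illustrative assumption,
# not a WordNet similarity score taken from the paper.
VOCAB = ["car", "automobile", "engine", "banana"]

# Hypothetical term-to-concept weights P (rows: terms in VOCAB order,
# columns: the concepts "vehicle" and "fruit"). Using S = P P^T keeps
# the term-similarity matrix symmetric positive semidefinite.
P = [
    [1.0, 0.0],  # car
    [1.0, 0.0],  # automobile
    [0.8, 0.0],  # engine
    [0.0, 1.0],  # banana
]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def to_concepts(doc_vec):
    """Map a term-frequency vector x into concept space: P^T x."""
    n_concepts = len(P[0])
    return [
        sum(doc_vec[t] * P[t][c] for t in range(len(P)))
        for c in range(n_concepts)
    ]


def linear_kernel(x, y):
    """Baseline kernel: counts only exact term overlap."""
    return dot(x, y)


def semantic_kernel(x, y):
    """k(x, y) = x^T (P P^T) y, computed as <P^T x, P^T y>."""
    return dot(to_concepts(x), to_concepts(y))


def gram_matrix(docs, kernel):
    """Pairwise kernel values for a list of document vectors."""
    return [[kernel(a, b) for b in docs] for a in docs]


doc_a = [2, 0, 0, 0]  # "car" twice
doc_b = [0, 1, 1, 0]  # "automobile" and "engine" once each
doc_c = [0, 0, 0, 3]  # "banana" three times

if __name__ == "__main__":
    docs = [doc_a, doc_b, doc_c]
    print("linear  :", gram_matrix(docs, linear_kernel))
    print("semantic:", gram_matrix(docs, semantic_kernel))
```

In this toy setup `doc_a` and `doc_b` share no terms, so the linear kernel between them is zero, while the semantic kernel is positive because both map to the same concept. A Gram matrix built this way could be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.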

References (14)

  • 1 Du Xiaoyong, Li Man, Wang Shan. A survey of ontology learning research[J]. Journal of Software (软件学报), 2006, 17(9): 1837-1847. Cited by: 241
  • 2 B. Boser, I. Guyon, V. Vapnik. A training algorithm for optimal margin classifiers[C]. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, New York: ACM, 1992: 144-152.
  • 3 V. Vapnik. The nature of statistical learning theory[M]. Berlin: Springer, 1995: 181-216.
  • 4 Han Jiawei, et al. Data Mining: Concepts and Techniques[M]. Translated by Fan Ming, Meng Xiaofeng. Beijing: China Machine Press, 2001: 162-191.
  • 5 G. Siolas, F. d'Alché-Buc. Support Vector Machines Based on a Semantic Kernel for Text Categorization[C]. In Proc. IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), Washington DC: IEEE Computer Society, 2000: 205-209.
  • 6 D. Mavroeidis, G. Tsatsaronis, M. Vazirgiannis, M. Theobald, G. Weikum. Word Sense Disambiguation for Exploiting Hierarchical Thesauri in Text Classification[C]. In Proc. the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2005), Berlin: Springer, 2005: 181-192.
  • 7 S. Bloehdorn, R. Basili, M. Cammisa, A. Moschitti. Semantic Kernels for Text Classification Based on Topological Measures of Feature Similarity[C]. In Proc. the Sixth International Conference on Data Mining (ICDM 2006), Washington DC: IEEE Computer Society, 2006: 808-812.
  • 8 Sujeevan Aseervatham, Emmanuel Viennet, Younès Bennani. A Semantic Kernel for Semi-Structured Documents[C]. In Proc. the Seventh IEEE International Conference on Data Mining (ICDM 2007), Washington DC: IEEE Computer Society, 2007: 403-408.
  • 9 R. Basili, M. Cammisa, A. Moschitti. A Semantic Kernel to Classify Texts with Very Few Training Examples[C]. In Proc. Workshop 'Learning in Web Search', 22nd International Conference on Machine Learning (ICML 2005), New York: ACM, 2005: 163-172.
  • 10 M. Shamsfard, A. A. Barforoush. Learning ontologies from natural language texts[J]. International Journal of Human-Computer Studies, 2004, 60(1): 17-63.

Secondary references: 2

Co-cited references: 251

Co-citing literature: 90

Citing documents: 8

Secondary citing documents: 15
