An Ontology-Based Method for Text Feature Selection (cited by: 1)

An Ontology-based Document Feature Extraction
Abstract: To address the high dimensionality of text feature vectors, this paper presents an ontology-based method for selecting text features. Using concept trees built from a domain ontology, the raw terms of the documents in each category are mapped to concepts, and term frequencies are converted into concept frequencies at the same time, so that the selected feature concepts represent the content of the documents well. Experimental results show that, compared with document vectors built without feature concept selection, the method reduces feature dimensionality while keeping the impact on text classification accuracy as small as possible.
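Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2008, No. 3, pp. 152-154 (3 pages)
Funding: Supported by the Fuzhou University Science and Technology Development Fund (2005-XQ-13, 2006-XQ-22, XRC-0511) and the Fujian Provincial Department of Education (JB06023)
Keywords: domain ontology, document feature, text classification, feature selection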
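
The abstract describes mapping raw terms to concepts in a concept tree and converting term frequencies into concept frequencies. The following is a minimal Python sketch of that idea, not the authors' implementation. The toy concept tree `PARENT`, the lexicon `TERM_TO_CONCEPT`, the depth cutoff `max_depth`, and the choice to drop unmapped terms are all assumptions made for illustration; the abstract does not say how concepts are chosen within the tree.

```python
from collections import Counter

# Hypothetical toy concept tree (child -> parent); not taken from the paper.
PARENT = {
    "sedan": "car",
    "suv": "car",
    "car": "vehicle",
    "bicycle": "vehicle",
}

# Hypothetical lexicon mapping raw terms to ontology concepts.
TERM_TO_CONCEPT = {
    "sedan": "sedan",
    "saloon": "sedan",
    "suv": "suv",
    "bike": "bicycle",
    "bicycle": "bicycle",
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def generalize(concept, max_depth):
    """Lift a concept to its ancestor at depth <= max_depth (root is depth 0)."""
    path = ancestors(concept)           # e.g. ["sedan", "car", "vehicle"]
    depth = len(path) - 1               # depth of the concept itself
    steps_up = max(0, depth - max_depth)
    return path[steps_up]


def concept_frequencies(tokens, max_depth=1):
    """Convert raw term counts into concept counts.

    Assumption: terms with no entry in the lexicon are dropped.
    """
    term_counts = Counter(tokens)
    concept_counts = Counter()
    for term, count in term_counts.items():
        concept = TERM_TO_CONCEPT.get(term)
        if concept is not None:
            concept_counts[generalize(concept, max_depth)] += count
    return concept_counts


tokens = ["sedan", "saloon", "suv", "bike", "bicycle", "engine"]
features = concept_frequencies(tokens, max_depth=1)
print(dict(features))
```

In this toy example, six distinct raw terms collapse into two concept features ("car" and "bicycle"), which is the kind of dimensionality reduction the abstract refers to. Lowering `max_depth` merges concepts further at the cost of coarser features.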