
An Ontology-based Approach of Web Texts Classification (cited by: 2)
Abstract: Traditional text classification methods require training on a text corpus and extracting feature terms during text representation. Collecting training texts takes laborious, time-consuming manual work, and extracting feature terms from Chinese text is difficult. To address these problems, this paper explores a new approach: ontology-based web text classification. The method first builds an ontology from HowNet, then builds an ontology for each class in the classification scheme, and finally classifies each text against the class ontologies. Experiments indicate that the approach is comparable to KNN in accuracy, is more stable than KNN, and outperforms KNN in recall.
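The three steps in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how a text is matched against a class ontology, so the weighted-concept scoring below is an assumption, and the names `CLASS_ONTOLOGIES`, `tokenize`, `score`, and `classify`, along with every term and weight, are hypothetical. In the paper the class ontologies are built from HowNet, whereas here they are hand-written dictionaries.

```python
from collections import Counter

# Hypothetical class ontologies: each class maps concept terms to weights.
# The paper derives these from HowNet; the terms and weights below are
# invented purely for illustration.
CLASS_ONTOLOGIES = {
    "sports": {"football": 1.0, "match": 0.8, "team": 0.6, "coach": 0.6},
    "finance": {"stock": 1.0, "bank": 0.8, "market": 0.6, "investor": 0.6},
}


def tokenize(text):
    """Lowercase whitespace tokenizer; Chinese text would need a word segmenter."""
    stripped = (tok.strip(".,;:!?\"'()") for tok in text.lower().split())
    return [tok for tok in stripped if tok]


def score(tokens, ontology):
    """Sum the weights of ontology concepts found in the text, normalized by length."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    matched = sum(weight * counts[term] for term, weight in ontology.items())
    return matched / len(tokens)


def classify(text, ontologies=CLASS_ONTOLOGIES):
    """Return the best-scoring class, or None when no concept matches."""
    tokens = tokenize(text)
    scores = {label: score(tokens, onto) for label, onto in ontologies.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Calling `classify("The team won the football match.")` selects the class whose ontology concepts appear most heavily in the text. No training corpus is involved, which is the property the abstract highlights over KNN.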
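Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), indexed in CSSCI and the Peking University Core Journals list, 2006, No. 2, pp. 202-207 (6 pages)
Funding: Natural Science Foundation of Zhejiang Province (No. M063149)
Keywords: ontology, text classification, HowNet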

References (8)

  • [1] C. Cortes, V. Vapnik. Support vector networks. Machine Learning, 1995, 20(3): 273-297
  • [2] Li Baoli, Lu Qin, Yu Shiwen. An adaptive k-nearest neighbor text categorization strategy. ACM Transactions on Asian Language Information Processing (TALIP). New York: ACM Press, 2004. 215-226
  • [3] Min-Yen Kan. Web page classification without the web page. In: Proceedings of the 13th International World Wide Web Conference on Alternate Track Papers & Posters Table of Contents, New York, NY, USA, 2004. New York: ACM Press, 2004. 262-263
  • [4] Yong Zhao. Ontology Resource Page. http://people.cs.uchicago.edu/~yongzh/Ontology.html, 2005-04-20
  • [5] Dong Zhendong. HowNet Knowledge Database. http://www.keenage.com/, 2005-04-20
  • [6] B. Lauser, T. Wildemann, A. Poulos, F. Fisseha, J. Keizer, S. Katz. A comprehensive framework for building multilingual domain ontologies: Creating a prototype biosecurity ontology. DC-2002: Metadata for e-Communities: Supporting Diversity and Convergence, Florence, Italy, October 2002
  • [7] W3C. RDF Vocabulary Description Language 1.0: RDF Schema. http://www.w3.org/TR/2004/REC-rdf-schema-20040210/, 2005-04-20
  • [8] Marc Ehrig, Alexander Maedche. Ontology-focused crawling of Web documents. In: Proceedings of the 2003 ACM Symposium on Applied Computing, Melbourne, Florida, 2003. New York: ACM Press, 2003. 1174-1178

Co-cited documents: 10

Citing documents: 2

Second-level citing documents: 12
