Abstract
This paper proposes an ontology-based heterogeneous text categorization system. It uses a structure ontology to eliminate structural differences among text documents and introduces a domain ontology into the categorization system, making categorization more accurate and efficient and the categorization rules easier to understand. Using the COSA algorithm to extract meaningful concepts also greatly reduces the number of key terms, which saves computational cost.
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journal (北大核心)
2004, Issue 21, pp. 123-125 (3 pages)
Keywords
Ontology
Heterogeneous text categorization system
Knowledge engineering
Machine learning
Text categorization
Rule