Abstract
A core problem in applying rough sets to automatic text classification is rule matching. As the volume of text keeps growing, text classification systems usually overlook the tension between a relatively fixed training set and a constantly changing stream of new documents. As new documents are rapidly added to the system, the classification rules learned from the original training set match them less and less well, classification accuracy falls, and some new documents find no matching rule at all. To address these problems, this paper studies a dynamic category extension method and proposes a new pattern-matching rule algorithm.
Source
Information Technology and Informatization (《信息技术与信息化》)
2008, Issue 4, pp. 71-73 (3 pages)
Keywords
Matching rules
Rough set (RS)
Decision tree
Category extension