Improved Rule Based Classification Algorithm with Multiple Covering Instances

Cited by: 7
Abstract: Rule sets extracted by rule-based classification algorithms commonly suffer from three problems. First, the extracted rule set contains too few short rules, so high-quality rules are scarce. Second, the rule set contains so few rules that almost every instance in the training data is covered by only one rule. Third, even when a large number of rules is extracted, some instances of small classes in the training data are not covered by any rule. This paper proposes an improved algorithm, rule-based classification with instances covered by multiple rules (RCIM). Its features are: (1) to improve rule quality, the choice of the first item of a generated rule considers not only the quality of an attribute value but also the quality of that value's complement; (2) as many high-quality rules as possible are generated at a time, and a training instance is deleted only after it has been covered by at least two rules; (3) when a test instance is difficult to classify, a second round of learning is performed on the attribute values of that instance to extract further rules. RCIM not only extracts a large number of rules effectively but also improves rule quality considerably. Experiments on a large number of data sets show that RCIM achieves higher classification accuracy than many other algorithms.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2017, Issue 6, pp. 1232-1238 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (61170129)
Keywords: data mining; classification; classification rule; rule covering
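
The deletion policy in feature (2) of the abstract can be illustrated with a minimal Python sketch. This is not the RCIM implementation from the paper: the function name `learn_rules`, the parameters `min_cover` and `min_conf`, the restriction to single-item rules, and the confidence-based rule selection are all assumptions made for illustration. The complement-based choice of the first item and the second round of learning at test time are not modelled.

```python
from collections import Counter


def learn_rules(rows, labels, min_cover=2, min_conf=0.8):
    """Greedy covering loop (illustrative only, not the paper's algorithm).

    Each rule is a single item (attribute index, value) -> class label.
    A training instance is retired only after `min_cover` rules of its
    own class have covered it.
    """
    cover_count = [0] * len(rows)
    active = set(range(len(rows)))   # instances still used for learning
    used_items = set()               # items already turned into rules
    rules = []

    while active:
        # Class distribution of every unused item among active instances.
        stats = {}
        for i in sorted(active):
            for attr, value in enumerate(rows[i]):
                if (attr, value) not in used_items:
                    stats.setdefault((attr, value), Counter())[labels[i]] += 1

        # Pick the item with the highest (confidence, support) pair.
        best = None
        for item, counts in stats.items():
            label, hits = counts.most_common(1)[0]
            conf = hits / sum(counts.values())
            key = (conf, hits)
            if conf >= min_conf and (best is None or key > best[0]):
                best = (key, item, label)
        if best is None:
            break                    # no remaining item is confident enough

        _, (attr, value), label = best
        used_items.add((attr, value))
        rules.append(((attr, value), label))

        # Count coverage; retire instances covered often enough.
        for i in list(active):
            if rows[i][attr] == value and labels[i] == label:
                cover_count[i] += 1
                if cover_count[i] >= min_cover:
                    active.discard(i)

    return rules, cover_count


if __name__ == "__main__":
    # Toy data with two redundant attributes (made up for this sketch).
    rows = [("sunny", "high"), ("sunny", "high"),
            ("rainy", "low"), ("rainy", "low")]
    labels = ["no", "no", "yes", "yes"]
    for k in (1, 2):
        rules, cover = learn_rules(rows, labels, min_cover=k)
        print(k, len(rules), cover)
```

On this toy data, `min_cover=1` stops after one rule per class, whereas `min_cover=2` keeps the instances in play long enough to learn a second rule for each class from the redundant attribute. That is the effect the abstract attributes to deleting an instance only after it has been covered at least twice.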