Abstract
Traditional association rule mining produces an original rule set containing a large number of disordered rules, many of which are redundant. Such a rule set is difficult for users to understand and apply. To address this problem, the paper examines the relationship between the original rule set and its rule-set representation, and proposes a new rule-set representation model. The model contains an inference system built on the principles of probability and statistics, which removes redundant rules from the original rule set to obtain a minimum-redundant, lossless rule-set representation. This representation is more concise and more intelligible than the original rule set, and easier for users to manage and apply. More importantly, the representation is lossless: the original rule set and the rule-set representation can each be derived from the other, which preserves the completeness of the information. Experiments on four well-known data sets show that the number of rules in the rule-set representation is significantly reduced.
Source
Acta Automatica Sinica (《自动化学报》)
EI
CSCD
Peking University Core Journal (北大核心)
2008, No. 12, pp. 1490-1496 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (70671007)
Keywords
Association rules, redundant rules, rule-set representation, losslessness