摘要
目前的关联规则挖掘算法主要依靠基于支持度的剪切策略来减小组合搜索空间.如果挖掘潜在的令人感兴趣的低支持度模式,这种策略并非有效.为此,提出一种新的关联模式—可信关联规则(credible association rule,简称CAR),规则中每个项目的支持度处于同一数量级,规则的置信度直接反映其可信程度,从而可以不必再考虑传统的支持度.同时,提出MaxcliqueMining算法,该算法采用邻接矩阵产生2-项可信集,进而利用极大团思想产生所有可信关联规则提出并证明了几个相关命题以说明这种规则的特点及算法的可行性和有效性.在告警数据集及Pumsb数据集上的实验表明,该算法挖掘CAR具有较高的效率和准确性.
Existing association-rule mining algorithms mainly rely on the support-based pruning strategy to prune its combinatorial search space. This strategy is not quite effective in the process of mining potentially interesting low-support patterns. To solve this problem, the paper presents a novel concept of association pattern called credible association rule (CAR), in which each item has the same support level. The confidence directly reflects the credible degree of the rule instead of the traditional support. This paper also proposes a MaxCliqueMining algorithm which creates 2-item credible sets by adjacency matrix and then generates all rules based on maximum clique. Some propositions are verified and which show the properties of CAR and the feasibility and validity of the algorithm. Experimental results on the alarm dataset and Pumsb dataset demonstrate the effectiveness and accuracy of this method for finding CAR.
出处
《软件学报》
EI
CSCD
北大核心
2008年第10期2597-2610,共14页
Journal of Software
基金
国家高技术研究发展计划(863)
高等学校学科创新引智计划~~
关键词
可信关联规则
极大团
数据挖掘
邻接矩阵
告警关联
credible association rule
maximum clique
data mining
adjacency matrix
alarm correlation