Abstract
The traditional Apriori algorithm mines strong association rules by filtering candidates against support and confidence thresholds. Not every strong rule mined this way is interesting, and the algorithm ignores negative association rules altogether, losing the important role such rules play in decision analysis. To filter uninteresting rules out of the strong rules and to mine interesting positive and negative association rules, the concept of interestingness is introduced. Several existing interestingness measures are studied, and a new interestingness measure is proposed based on their properties. Theorems derived from the properties of support, confidence, and interestingness are stated and proved, and are then used to mine interesting positive and negative association rules. An algorithm is designed on top of the new measure and validated on a real data set. The results show that mining positive and negative association rules with the proposed measure is feasible, and that it yields more practical and effective results than the classical Apriori approach.
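To make the quantities in the abstract concrete, the following is a minimal sketch of how support and confidence extend from a positive rule A ⇒ B to the three negative forms A ⇒ ¬B, ¬A ⇒ B, and ¬A ⇒ ¬B, using the standard identities supp(¬A) = 1 − supp(A) and supp(A ∪ ¬B) = supp(A) − supp(A ∪ B). The abstract does not define the new interestingness measure, so the sketch substitutes lift (confidence divided by the support of the consequent), a widely used measure, purely as a stand-in. The function names `support` and `rule_metrics` and the toy transactions are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]], a: Iterable[str], b: Iterable[str]
) -> Dict[str, Dict[str, float]]:
    """Support, confidence, and lift for A => B and its three negative forms.

    Lift is a stand-in here; it is NOT the measure proposed in the paper.
    """
    s_a = support(transactions, a)
    s_b = support(transactions, b)
    s_ab = support(transactions, frozenset(a) | frozenset(b))

    # (rule support, antecedent support, consequent support) per rule form,
    # derived from s_a, s_b, s_ab via inclusion-exclusion.
    forms = {
        "A => B": (s_ab, s_a, s_b),
        "A => not B": (s_a - s_ab, s_a, 1 - s_b),
        "not A => B": (s_b - s_ab, 1 - s_a, s_b),
        "not A => not B": (1 - s_a - s_b + s_ab, 1 - s_a, 1 - s_b),
    }

    metrics: Dict[str, Dict[str, float]] = {}
    for name, (s_rule, s_ante, s_cons) in forms.items():
        if s_ante == 0 or s_cons == 0:
            continue  # confidence or lift is undefined for this form
        conf = s_rule / s_ante
        metrics[name] = {
            "support": s_rule,
            "confidence": conf,
            "lift": conf / s_cons,
        }
    return metrics


if __name__ == "__main__":
    # Toy data (hypothetical): 8 transactions.
    data = (
        [frozenset({"tea", "coffee"}), frozenset({"tea"}), frozenset({"milk"})]
        + [frozenset({"coffee"})] * 5
    )
    for rule, m in rule_metrics(data, {"tea"}, {"coffee"}).items():
        print(rule, m)
```

On this toy data, tea ⇒ coffee and tea ⇒ ¬coffee have the same support and confidence, yet their lift values fall on opposite sides of 1. Support and confidence alone cannot tell the two rules apart, which is the kind of gap an interestingness measure is meant to close.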
Authors
MA Yan-qin (马彦勤)
WU Tong (武彤)
DENG Xuan-kun (邓烜堃)
School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
Source
Computer Technology and Development (《计算机技术与发展》)
2018, No. 5, pp. 38-41, 46 (5 pages)
Funding
Guizhou Province Science and Technology Plan Project (黔科合GY字[2010]3061)
Keywords
support
confidence
interestingness measure
positive and negative association rules
data mining