Abstract
Two-stage association rule mining, built on frequent itemset generation followed by rule generation, requires the user to specify the minimum support and minimum confidence thresholds manually from prior knowledge. To overcome this defect, this paper proposes AARM_BR (Adaptation Association Rule Mining Based on Determination Coefficient R^2), an algorithm that applies curve fitting to support counts and confidence values and uses the coefficient of determination to choose the order of the curve and the corresponding polynomial automatically, from which the support and confidence thresholds are then determined. Because AARM_BR is driven by the data, both thresholds are obtained automatically. Experiments on two standard datasets, Trolley and Groceries, show that, compared with a recently published method, the proposed algorithm is more data-dependent and, when the user has no prior knowledge, requires no manual specification of the polynomial order or of the support and confidence thresholds.
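The order-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' AARM_BR implementation: the abstract does not give the stopping rule, so the target value `r2_target`, the cap `max_order`, the function names, and the toy support counts below are all assumptions made for illustration. How the thresholds are read off the fitted curve is not described in the abstract and is left out.

```python
def polyfit(xs, ys, order):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    n = order + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def r_squared(ys, fitted):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def select_order(xs, ys, r2_target=0.99, max_order=5):
    """Return the lowest order whose fit reaches r2_target (assumed rule)."""
    for order in range(1, max_order + 1):
        coeffs = polyfit(xs, ys, order)
        r2 = r_squared(ys, [polyval(coeffs, x) for x in xs])
        if r2 >= r2_target:
            break
    return order, coeffs, r2


# Hypothetical data: itemset rank versus support count, sorted descending.
ranks = [1, 2, 3, 4, 5, 6, 7, 8]
support_counts = [64, 49, 36, 25, 16, 9, 4, 1]
order, coeffs, r2 = select_order(ranks, support_counts)
print(order, round(r2, 6))
```

The sketch stops at the first order that reaches the target so that the fitted curve stays as simple as the data allow; if no order up to `max_order` reaches the target, the last order tried is returned.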
Authors
WANG Xueping (王雪平)
LIN Jiaxiang (林甲祥)
WU Jianwei (巫建伟)
GAO Minjie (高敏节)
Affiliations: College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China; Third Institute of Oceanography, Ministry of Natural Resources, Xiamen 361001, China
Source
CAAI Transactions on Intelligent Systems (《智能系统学报》)
CSCD
Peking University Core Journal (北大核心)
2020, Issue 2, pp. 352-359 (8 pages)
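Funding
National Natural Science Foundation of China (41401458)
Natural Science Foundation of Fujian Province (2018J01644, 2018J01645, 2016J01753)
China-ASEAN Maritime Cooperation Fund (2020399)
Project of the Third Institute of Oceanography, State Oceanic Administration (2016020)
Education and Scientific Research Project for Young and Middle-aged Teachers of Fujian Province (JT180129)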
Keywords
association rule
order
adaptive
coefficient of determination
rule
support
confidence
curve fitting
polynomial
data mining