Abstract
Existing algorithms for mining class bridges tend to produce false class bridges, which lowers mining accuracy. This paper defines the significance of an item set with the chi-square test, which effectively filters out rules drawn from frequent item sets that have high support but only weak correlation. The class bridges of real interest can thus be found without affecting the quality or efficiency of the algorithm. Experimental results show that the proposed algorithm avoids generating false class bridges, reduces type II errors in statistical inference (accepting a false hypothesis), and improves mining accuracy.
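The filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definition of item set significance, so the code below assumes a standard Pearson chi-square test on the 2x2 contingency table of two items A and B. The function names, the `min_support` parameter, and the default threshold of 3.841 (the 0.05 critical value for one degree of freedom) are illustrative assumptions, not the authors' algorithm.

```python
def chi_square_2x2(n, n_a, n_b, n_ab):
    """Pearson chi-square statistic for items A and B over n transactions.

    n_a, n_b: number of transactions containing A, B.
    n_ab: number of transactions containing both.
    Assumes 0 < n_a < n and 0 < n_b < n so every expected count is nonzero.
    """
    if not (0 < n_a < n and 0 < n_b < n):
        raise ValueError("each item must appear in some but not all transactions")
    # Rows: A present / A absent. Columns: B present / B absent.
    observed = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def is_significant(n, n_a, n_b, n_ab, min_support=0.1, critical=3.841):
    """Keep a pair only if it is frequent AND its items are correlated.

    Hypothetical filter: critical=3.841 is the chi-square critical value
    at the 0.05 level with one degree of freedom.
    """
    support = n_ab / n
    return support >= min_support and chi_square_2x2(n, n_a, n_b, n_ab) >= critical


# High support (0.6) but A and B are statistically independent: filtered out.
independent_pair = is_significant(1000, 800, 750, 600)
# Lower support (0.2) but strongly correlated: kept.
correlated_pair = is_significant(1000, 300, 300, 200)
print(independent_pair, correlated_pair)
```

In the first call the observed counts equal the counts expected under independence, so the statistic is zero and the pair is rejected despite its high support. This is the kind of candidate the abstract describes as a source of false class bridges.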
Source
Journal of Guangxi University (Natural Science Edition)
CAS
CSCD
Peking University Core Journal
2010, No. 5, pp. 799-806 (8 pages)
Funding
National Natural Science Foundation of China (90718020)
Natural Science Foundation of Guangxi (0832005Z)
Keywords
data mining
association rule mining
clustering
class bridge