Redundancy Removal and Clustering of Association Rules (关联规则的冗余删除与聚类)
Cited by: 15

Pruning and Clustering Discovered Association Rules

Abstract: Association rule mining often produces a very large number of rules, which makes it hard for users to analyze and use them. The problem is especially pronounced when the attributes in the database are highly correlated. To support exploratory analysis, the number of rules can be reduced substantially with techniques such as mining association rules with item constraints, post-pruning, and clustering or summarizing (generalizing) rules. This paper proposes two algorithms: ADRR, which removes redundant association rules, and ACAR, which clusters association rules. Using properties of sets, the paper proves that the mined rules contain many redundant rules that can be deleted, and ADRR is derived from this result. ACAR defines the distance between rules in a new way, through the correlation between items, and applies the idea of the DBSCAN algorithm to cluster the rules into a structure suited to exploratory analysis. Both algorithms were implemented, and an experiment on a real-life database shows that the method is practical, effective, and reasonably efficient.
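The abstract does not give the details of ADRR or ACAR, so the following is only a minimal sketch of the general prune-then-cluster pipeline under stated assumptions. The redundancy test (a rule is dropped when a more general rule with the same consequent has at least the same confidence) and the Jaccard distance over rule items are stand-ins chosen for illustration, not the paper's set-based criterion or its correlation-based distance. The rule data, function names, and parameter values are hypothetical.

```python
# Each rule is (antecedent, consequent, confidence). The values are made up.
rules = [
    (frozenset({"bread"}), frozenset({"milk"}), 0.80),
    (frozenset({"bread", "butter"}), frozenset({"milk"}), 0.78),
    (frozenset({"bread", "jam"}), frozenset({"milk"}), 0.90),
    (frozenset({"beer"}), frozenset({"chips"}), 0.70),
    (frozenset({"beer", "nuts"}), frozenset({"chips"}), 0.65),
]


def prune_redundant(rules):
    """Assumed criterion (not ADRR): drop a rule if another rule has a
    proper-subset antecedent, the same consequent, and confidence >= its own."""
    kept = []
    for ante, cons, conf in rules:
        redundant = any(
            a < ante and c == cons and cf >= conf for a, c, cf in rules
        )
        if not redundant:
            kept.append((ante, cons, conf))
    return kept


def rule_distance(r1, r2):
    """Assumed distance (not ACAR's): Jaccard distance over all items in each rule."""
    items1 = r1[0] | r1[1]
    items2 = r2[0] | r2[1]
    return 1.0 - len(items1 & items2) / len(items1 | items2)


def dbscan(points, dist, eps, min_pts):
    """Plain DBSCAN with a pluggable distance; returns one label per point, -1 = noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # A point counts as its own neighbor (distance 0).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, so keep expanding
    return labels


kept = prune_redundant(rules)
labels = dbscan(kept, rule_distance, eps=0.5, min_pts=2)
for rule, label in zip(kept, labels):
    print(label, sorted(rule[0]), "->", sorted(rule[1]), rule[2])
```

With this toy input, three of the five rules survive pruning. The two bread-to-milk rules fall into one cluster, and the remaining beer-to-chips rule is labeled as noise. Swapping `rule_distance` for a correlation-based measure would bring the sketch closer to what the abstract describes.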

Authors: Wei Suyun (韦素云), Ji Genlin (吉根林), Qu Weiguang (曲维光)

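Affiliations: Department of Computer Science, Nanjing Normal University; Provincial Key Laboratory of Computer Information Processing, Soochow University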

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 1, pp. 110-113 (4 pages)

Funding: Supported by the Jiangsu Provincial Key Laboratory Open Fund (KJS03064).

Keywords: association rules; correlation; clustering

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
