Improved Diffsets Algorithm in Association Rules Mining (关联规则挖掘中改进型Diffsets算法)

Cited by: 1


Abstract: Frequent itemset mining is a key step in association rule mining. On dense datasets, traditional mining algorithms tend to generate large numbers of useless intermediate results, which wastes a great deal of memory, especially at low support thresholds. The Diffsets algorithm introduces the notion of a "diffset" (difference set) and thereby eases, to some extent, the conflict between the volume of intermediate results produced during mining and the available memory. The improved Diffsets algorithm builds on the original: during the diffset computation it sorts diffsets in descending order of the number of transaction identifiers (tids) they contain, which further reduces the number of intermediate results produced during mining. Analysis and a worked example indicate that the improved algorithm occupies less memory during execution and converges faster.
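The idea summarized in the abstract can be illustrated with a minimal sketch of diffset-based vertical mining. This is not the authors' implementation: the function names `mine_diffsets` and `_extend` are hypothetical, and the descending sort by diffset size is one reading of the ordering the abstract describes. The sketch relies on the standard diffset identities: for a prefix P and two extensions PX and PY, d(PXY) = d(PY) \ d(PX) and support(PXY) = support(PX) - |d(PXY)|.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# One class member: (itemset as a tuple, its diffset, its support count).
Member = Tuple[Tuple[str, ...], Set[int], int]


def mine_diffsets(transactions: Iterable[Iterable[str]],
                  min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every frequent itemset with its support count."""
    rows = [set(t) for t in transactions]
    all_tids = set(range(len(rows)))

    # Build the vertical layout: item -> set of transaction ids (tids).
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(rows):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    # Level 1: the diffset of an item is taken against all transactions.
    roots: List[Member] = [
        ((item,), all_tids - tids, len(tids))
        for item, tids in tidsets.items()
        if len(tids) >= min_support
    ]

    result: Dict[FrozenSet[str], int] = {}
    _extend(roots, min_support, result)
    return result


def _extend(members: List[Member], min_support: int,
            result: Dict[FrozenSet[str], int]) -> None:
    # Assumed ordering: larger diffsets (lower support) first,
    # with the itemset itself as a deterministic tie-breaker.
    members = sorted(members, key=lambda m: (-len(m[1]), m[0]))
    for i, (items_x, diff_x, sup_x) in enumerate(members):
        result[frozenset(items_x)] = sup_x
        children: List[Member] = []
        for items_y, diff_y, _ in members[i + 1:]:
            diff_xy = diff_y - diff_x          # d(PXY) = d(PY) \ d(PX)
            sup_xy = sup_x - len(diff_xy)      # support from the diffset size
            if sup_xy >= min_support:
                children.append((items_x + (items_y[-1],), diff_xy, sup_xy))
        if children:
            _extend(children, min_support, result)
```

Only diffsets are carried below the first level, so on dense data, where tidsets are long and differences between sibling itemsets are short, the stored intermediate sets stay small. Placing members with large diffsets first means the low-support itemsets are the ones that get extended with the most siblings, and those extensions tend to fall below the threshold early.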

Authors: Sun Zhichang (孙志长), Feng Zuhong (冯祖洪)

Affiliation: School of Computer Science and Engineering, North Minzu University (北方民族大学)

Source: Modern Electronics Technique (《现代电子技术》), 2008, No. 22, pp. 80-83, 87 (5 pages)

Funding: Ningxia Natural Science Foundation project (NZ0697); Ningxia Higher Education Science and Technology Research Project (2006JY018)

Keywords: data mining; association rules mining; frequent itemset mining; Diffsets

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]



