
Study of Some Key Techniques in Mining Association Rule

Cited by: 62
Abstract: Apriori-like algorithms have become the classic approach to mining association rules. Their technical difficulty and computational cost are concentrated in two areas: (1) how to generate candidate frequent itemsets and compute the support counts of itemsets; and (2) how to reduce the number of candidate frequent itemsets and the number of database scans. Many improvements have been proposed for the second problem, with good results. For the first problem, however, the solution from the original Apriori algorithm is still in use, and its computational cost is high. This paper therefore proposes a binary-representation-based algorithm for generating candidate frequent itemsets and computing their support counts. The algorithm needs only logical operations such as OR, AND, and XOR on the data being mined, which significantly reduces implementation difficulty. Combining it with Apriori-like algorithms can further improve execution efficiency, and the experimental results show that the algorithm is effective and fast.
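Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2005, No. 10, pp. 1785-1789 (5 pages)
Funding: Jiangsu University Research Start-up Fund (04KJD001); National Natural Science Foundation of China (70371015)
Keywords: data mining; association rules; frequent itemsets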
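
The abstract describes generating candidates and counting support with only OR, AND, and XOR on a binary representation, but this record does not give the algorithm itself. The following is a minimal sketch of one way such an approach could look, not the authors' method. It assumes each transaction and each itemset is encoded as an integer bitmask with one bit per item; the function names `encode`, `support`, `join`, and `frequent_itemsets` are hypothetical, and the Apriori subset-pruning step is omitted for brevity.

```python
from itertools import combinations


def encode(items, index):
    """Pack a collection of items into an integer bitmask (assumed encoding)."""
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def support(itemset_mask, transactions):
    """Count transactions containing every item of the itemset (AND test)."""
    return sum(1 for t in transactions if (t & itemset_mask) == itemset_mask)


def join(frequent_k):
    """Build (k+1)-item candidates from pairs of frequent k-itemsets."""
    candidates = set()
    for a, b in combinations(frequent_k, 2):
        # XOR keeps only the items that a and b do not share. Exactly two
        # set bits means the pair shares k-1 items, so OR has k+1 items.
        if bin(a ^ b).count("1") == 2:
            candidates.add(a | b)
    return candidates


def frequent_itemsets(transactions, n_items, min_support):
    """Level-wise search; returns {itemset_mask: support_count}."""
    level = {1 << i for i in range(n_items)}
    level = {m for m in level if support(m, transactions) >= min_support}
    result = {}
    while level:
        for m in level:
            result[m] = support(m, transactions)
        level = {c for c in join(level)
                 if support(c, transactions) >= min_support}
    return result


# Toy data (illustrative only): five transactions over items a, b, c, d.
index = {"a": 0, "b": 1, "c": 2, "d": 3}
transactions = [encode(t, index) for t in ("abc", "ab", "ac", "bcd", "abcd")]
found = frequent_itemsets(transactions, n_items=4, min_support=3)
print(found[encode("ab", index)])  # support count of {a, b} -> 3
```

With a minimum support of 3, the toy data yields six frequent itemsets: {a}, {b}, {c}, {a, b}, {a, c}, and {b, c}. The candidate {a, b, c} is generated by OR-ing two pairs but is rejected because only two transactions contain it.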

