SQL-Based Frequent Pattern Mining Without Candidate Generation (基于SQL的不产生候选集的频繁模式挖掘). Cited by: 1

SQL-based Frequent Pattern Mining Without Candidacy Generation

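Abstract: Frequent pattern mining is an important component of data mining, yet many earlier studies rely on Apriori-style iterative methods that generate candidate sets and then test them. These methods typically have to scan the database many times and iteratively test a large number of generated candidates, a drawback that is especially pronounced when mining long patterns. The FP-growth method uses a divide-and-conquer strategy, needs only two scans of the database, and avoids generating large numbers of candidates. The SQL-based frequent pattern mining method in this paper builds on FP-growth and improves it using subqueries and DBMS extensions (such as user-defined functions).

English abstract: A fundamental component of data mining tasks is finding frequent patterns in a given dataset. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are prolific patterns and/or long patterns. This paper presents an evaluation of SQL-based frequent pattern mining with a novel frequent pattern growth (FP-growth) method, which is efficient and scalable for mining both long and short patterns without candidate generation. It also examines techniques that improve performance by using DBMS extensions and reports a performance evaluation on a commercial RDBMS (IBM DB2 UDB EEE V8).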
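The two database scans mentioned in the abstract can be sketched in SQL roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: it uses Python's built-in `sqlite3` module rather than DB2, and the table names `T(tid, item)` and `F(item, support)` as well as the function name are hypothetical. The first scan counts item supports; the second keeps only frequent items and orders each transaction by descending support, which is the input an FP-tree would be built from. Tree construction and the recursive mining step are omitted.

```python
import sqlite3


def frequent_items_and_ordered_transactions(rows, min_support):
    """rows: iterable of (tid, item) pairs. Hypothetical helper for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (tid INTEGER, item TEXT)")   # assumed schema
    conn.execute("CREATE TABLE F (item TEXT, support INTEGER)")
    conn.executemany("INSERT INTO T VALUES (?, ?)", rows)

    # Scan 1: count in how many transactions each item occurs and keep
    # only the items that reach the minimum support.
    conn.execute(
        "INSERT INTO F "
        "SELECT item, COUNT(DISTINCT tid) FROM T "
        "GROUP BY item HAVING COUNT(DISTINCT tid) >= ?",
        (min_support,),
    )
    supports = dict(conn.execute("SELECT item, support FROM F"))

    # Scan 2: drop infrequent items and order the remaining items of each
    # transaction by descending support (ties broken by item name).
    cursor = conn.execute(
        "SELECT DISTINCT T.tid, T.item, F.support "
        "FROM T JOIN F ON T.item = F.item "
        "ORDER BY T.tid, F.support DESC, T.item"
    )
    ordered = {}
    for tid, item, _support in cursor:
        ordered.setdefault(tid, []).append(item)
    conn.close()
    return supports, ordered
```

Called with a small list of `(tid, item)` pairs and a minimum support of 2, the function returns the support of each frequent item together with the filtered, frequency-ordered item list of every transaction that still contains a frequent item.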

Authors: Shang Xuequn (尚学群), Shen Junyi (沈均毅)

Affiliation: School of Electronic and Information Engineering, Xi'an Jiaotong University

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2004, No. 1, pp. 92-95 (4 pages)

Keywords: data mining; frequent pattern; SQL; DBMS extension

Classification code: TP311.132 [Automation and Computer Technology: Computer Software and Theory]

References (6)

[1] Agrawal R, Shim K. Developing tightly-coupled data mining applications on a relational database system[A]. Proceedings of the 2nd International Conference on Knowledge Discovery in Databases and Data Mining[C]. Portland, Oregon, 1996.
[2] Agrawal R, Srikant R. Fast algorithms for mining association rules[A]. Proceedings of the 20th VLDB Conference[C]. Santiago, Chile, 1994. 487-499.
[3] Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[A]. Proceedings of the ACM SIGMOD Conference on Management of Data[C]. 2000.
[4] Park JS, Chen M, Yu PS. An effective hash-based algorithm for mining association rules[A]. Proceedings of the ACM SIGMOD Conference on Management of Data[C]. 1995. 175-186.
[5] Savasere A, Omiecinski E, Navathe S. An efficient algorithm for mining association rules in large databases[A]. Proceedings of the 21st VLDB Conference[C]. 1995.
[6] Sarawagi S, Thomas S, Agrawal R. Integrating mining with relational database systems: alternatives and implications[A]. Proceedings of the ACM SIGMOD Conference on Management of Data[C]. Seattle, Washington, USA, 1998.
