GTK: A Hybrid-Search Algorithm of Top-Rank-k Frequent Patterns Based on Greedy Strategy (Cited by: 1)

Abstract: Top-rank-k mining is widely used to find frequent patterns whose rank does not exceed k. In existing algorithms, a level-wise search can mine all target patterns, but it usually delays the generation of high-rank patterns, so the support threshold grows slowly and mining efficiency suffers. To address this problem, this paper proposes GTK, a greedy-strategy-based hybrid mining algorithm for top-rank-k frequent patterns. In GTK, top-rank-k patterns are stored in a static doubly linked list called RSL, and patterns are divided into short patterns and long patterns. Short patterns are generated by a rank-first search, which always joins the two highest-ranked patterns in RSL that have not yet been joined. Long patterns are then extracted by a level-wise search, starting from the short patterns that satisfy specific conditions. To reduce redundancy, GTK improves the generation method of the subsume index and introduces new pruning strategies for candidates. The algorithm also uses pruning strategies to reduce the amount of computation and improve speed. Experiments on real and synthetic datasets show that GTK has clear advantages in both time efficiency and space efficiency.
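To make the mining target concrete, the following is a minimal brute-force sketch of what "patterns with a rank not exceeding k" means. It is not GTK and uses none of its structures (RSL, subsume index, pruning). It assumes the usual convention that rank r belongs to the r-th largest distinct support value, and the function name `top_rank_k` is hypothetical.

```python
from itertools import combinations

def top_rank_k(transactions, k):
    """Brute-force reference: map each rank 1..k to its (pattern, support) pairs.

    Assumption: rank r is given to the r-th largest distinct support value,
    so several patterns can share one rank.
    """
    tx = [set(t) for t in transactions]
    items = sorted({item for t in tx for item in t})
    supports = {}
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            # Support = number of transactions containing every item of the pattern.
            count = sum(1 for t in tx if set(pattern) <= t)
            if count > 0:
                supports[pattern] = count
    # Keep only the k largest distinct support values and number them 1..k.
    distinct = sorted(set(supports.values()), reverse=True)[:k]
    rank_of = {s: r for r, s in enumerate(distinct, start=1)}
    result = {r: [] for r in rank_of.values()}
    for pattern, s in sorted(supports.items()):
        if s in rank_of:
            result[rank_of[s]].append((pattern, s))
    return result

example = top_rank_k([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}], k=2)
```

This enumeration is exponential in the number of items, which is why practical algorithms raise the support threshold as early as possible while mining.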

Authors: Yuhang Long, Wensheng Tang, Bo Yang, Xinyu Wang, Hua Ma, Hang Shi, Xueyu Cheng

Affiliations: Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing; College of Information Science and Engineering; Clayton State University

Source: Computers, Materials & Continua (SCIE, EI), 2020, No. 6, pp. 1445-1469 (25 pages)

Funding: This research was supported in part by the Hunan Province’s Strategic and Emerging Industrial Projects under Grant 2018GK4035, in part by the Hunan Province’s Changsha Zhuzhou Xiangtan National Independent Innovation Demonstration Zone projects under Grant 2017XK2058, in part by the National Natural Science Foundation of China under Grant 61602171, and in part by the Scientific Research Fund of Hunan Provincial Education Department under Grants 17C0960 and 18B037.

Keywords: Top-rank-k; frequent patterns; greedy strategy; hybrid-search

Classification: TN9 [Electronics and Telecommunications: Information and Communication Engineering]

