A New Algorithm for Mining High Utility Sequential Patterns Based on Pattern-growth (cited by: 2)

Abstract: High utility sequential pattern mining is an important research topic in data mining, with applications in areas such as bioinformatics and consumer behavior analysis. Unlike traditional itemset mining methods, which consider only the occurrence counts or the utility of itemsets, high utility sequential pattern mining takes into account both the internal and external utility of itemsets and the order in which items appear in the transactions, so its computational complexity is higher than that of frequent itemset mining or high utility itemset mining. Although a number of algorithms have been proposed for this problem, the time and space efficiency of mining remains a central research issue in the field. This paper therefore proposes HUSP-FP, a high utility sequential pattern mining algorithm based on pattern growth. Because a high utility sequential itemset must satisfy the downward closure property of transaction utility, the algorithm first removes useless items and builds a global tree, and then uses pattern growth to extract all high utility sequential patterns from the global tree without generating candidate itemsets. In experiments comparing time and space cost against three efficient existing algorithms, HUSP-Miner, USPAN and HUS-Span, HUSP-FP still mined the relevant sequential patterns effectively at small thresholds and achieved substantial improvements in both running time and memory usage.
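
The "remove useless items" step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the details of HUSP-FP, so this is not the authors' implementation. It assumes that the closure property is used in the way common in this literature: an item is dropped when the summed utility of all sequences containing it (its sequence-weighted utility, SWU) falls below the threshold. The data, the names `PROFIT`, `DATABASE`, `sequence_weighted_utility` and `prune_unpromising`, and the threshold value are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data. A sequence is a list of itemsets, and each
# itemset maps an item to its quantity (internal utility).
# PROFIT holds the unit profit (external utility) of each item.
PROFIT = {"a": 2, "b": 5, "c": 1, "d": 3}

DATABASE = [
    [{"a": 1, "b": 2}, {"c": 3}],
    [{"a": 2}, {"d": 1}],
    [{"c": 1}, {"d": 1}],
]


def sequence_utility(seq, profit):
    """Utility of one sequence: sum of quantity * unit profit over its items."""
    return sum(qty * profit[item] for itemset in seq for item, qty in itemset.items())


def sequence_weighted_utility(db, profit):
    """SWU of each item: total utility of every sequence that contains the item."""
    swu = defaultdict(int)
    for seq in db:
        su = sequence_utility(seq, profit)
        # Count each item once per sequence, however often it occurs.
        for item in {i for itemset in seq for i in itemset}:
            swu[item] += su
    return dict(swu)


def prune_unpromising(db, profit, min_util):
    """Remove items whose SWU is below min_util (assumed pruning rule).

    SWU is an upper bound on the utility of any pattern containing the
    item, so such items cannot appear in a high utility pattern.
    """
    swu = sequence_weighted_utility(db, profit)
    keep = {item for item, value in swu.items() if value >= min_util}
    pruned = []
    for seq in db:
        new_seq = [{i: q for i, q in itemset.items() if i in keep} for itemset in seq]
        new_seq = [itemset for itemset in new_seq if itemset]  # drop emptied itemsets
        if new_seq:
            pruned.append(new_seq)
    return pruned, swu


if __name__ == "__main__":
    pruned_db, swu = prune_unpromising(DATABASE, PROFIT, min_util=15)
    print(swu)
    print(pruned_db)
```

This sketch performs a single pruning pass. Removing items lowers the utilities of the remaining sequences, so an implementation could recompute the SWU values and repeat the pass until no further item is removed. Building the global tree and the pattern-growth search are beyond what the abstract specifies and are not sketched here.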

Authors: TANG Hui-Jun; WANG Le; FAN Cheng-Li (School of Finance and Information, Ningbo University of Finance and Economics, Ningbo 315175; School of Digital Technology and Engineering, Ningbo University of Finance and Economics, Ningbo 315175)

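Affiliations: School of Finance and Information, Ningbo University of Finance and Economics; School of Digital Technology and Engineering, Ningbo University of Finance and Economics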

Source: Acta Automatica Sinica (indexed in EI, CAS, CSCD and the Peking University Core Journals list), 2021, No. 4, pp. 943-954 (12 pages)

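Funding: Supported by the Zhejiang Provincial Public Welfare Technology Application Research Program (LGF19H180002, 2017C35014), the Ningbo Natural Science Foundation (2017A610122), and the Cixi Social Development Science and Technology Program (CN2018001).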

Keywords: high utility sequential pattern; pattern growth; downward closure property; data mining

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

