
A New Algorithm for Mining High Utility Sequential Patterns Based on Pattern-growth (cited by: 2)
Abstract: High utility sequential pattern mining is an important topic in data mining, with applications in areas such as bioinformatics and consumer behavior analysis. Unlike traditional frequent pattern mining, which considers only how often itemsets appear, high utility sequential pattern mining takes into account both the internal and external utility of itemsets and the order in which items occur in the transactions, so its computational complexity is higher than that of frequent itemset mining or high utility itemset mining. Although a number of algorithms have been proposed for this problem, their time and space efficiency remains the main research focus in the field. This paper therefore proposes HUSP-FP, a high utility sequential pattern mining algorithm based on pattern growth. Because high utility sequential itemsets must satisfy the downward closure property of transaction utility, the algorithm first removes useless items and builds a global tree, then applies pattern growth to extract all high utility sequential patterns from the global tree without generating candidate itemsets. In experiments comparing running time and memory use against three efficient existing algorithms, HUSP-Miner, USPAN, and HUS-Span, HUSP-FP still mined the relevant sequential patterns effectively at small thresholds and improved substantially on both running time and memory usage.
Authors: TANG Hui-Jun (唐辉军); WANG Le (王乐); FAN Cheng-Li (樊成立) (School of Finance and Information, Ningbo University of Finance and Economics, Ningbo 315175; School of Digital Technology and Engineering, Ningbo University of Finance and Economics, Ningbo 315175)
Source: Acta Automatica Sinica (《自动化学报》), 2021, No. 4, pp. 943-954 (12 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
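Funding: Supported by the Zhejiang Provincial Public Welfare Technology Application Research Program (LGF19H180002, 2017C35014), the Ningbo Natural Science Foundation (2017A610122), and the Cixi Social Development Science and Technology Program (CN2018001).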
Keywords: high utility sequential pattern; pattern growth; downward closure property; data mining
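
The abstract says that useless items are removed before the global tree is built, relying on the downward closure property of transaction utility. The sketch below illustrates that general pruning idea only; it is not the authors' HUSP-FP implementation, and it does not build the global tree. The data layout, the toy values, and the function names (`sequence_utility`, `weighted_utilities`, `prune_unpromising_items`) are assumptions made for illustration. An item's weighted utility is taken here as the total utility of all sequences that contain it, which is an upper bound on the utility of any pattern containing that item.

```python
from collections import defaultdict

# Hypothetical toy data: external utility (for example, unit profit) per item.
external_utility = {"a": 2, "b": 5, "c": 1, "d": 10}

# Each sequence is an ordered list of itemsets.
# Each itemset maps an item to its internal utility (for example, quantity).
database = [
    [{"a": 1, "b": 2}, {"c": 3}],
    [{"a": 2}, {"c": 1, "d": 1}],
    [{"b": 1}, {"c": 2}],
]


def sequence_utility(sequence, external_utility):
    """Sum quantity * unit utility over every item occurrence in a sequence."""
    return sum(
        qty * external_utility[item]
        for itemset in sequence
        for item, qty in itemset.items()
    )


def weighted_utilities(database, external_utility):
    """For each item, add up the utilities of all sequences that contain it."""
    totals = defaultdict(int)
    for sequence in database:
        utility = sequence_utility(sequence, external_utility)
        distinct_items = {item for itemset in sequence for item in itemset}
        for item in distinct_items:
            totals[item] += utility
    return dict(totals)


def prune_unpromising_items(database, external_utility, min_utility):
    """Drop items whose weighted utility is below the threshold.

    Such items cannot appear in any pattern whose utility reaches
    min_utility, because the weighted utility is an upper bound.
    """
    totals = weighted_utilities(database, external_utility)
    keep = {item for item, value in totals.items() if value >= min_utility}
    pruned = []
    for sequence in database:
        filtered = [
            {item: qty for item, qty in itemset.items() if item in keep}
            for itemset in sequence
        ]
        filtered = [itemset for itemset in filtered if itemset]  # drop empty itemsets
        if filtered:
            pruned.append(filtered)
    return pruned, totals


pruned_db, totals = prune_unpromising_items(database, external_utility, min_utility=20)
print(totals)
print(pruned_db)
```

With the toy threshold of 20, item `d` appears only in the second sequence, whose total utility is 15, so it is removed before any pattern growth would start. A pattern-growth miner would then work on the pruned database only.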
