An improved algorithm for mining maximal frequent itemsets over data streams
Abstract An improved algorithm based on DSM-MFI, named DSMMFI-DS (Dictionary Sequence Mining Maximal Frequent Itemsets over Data Streams), is proposed. First, it stores the transaction data in the DSFI-list according to a fixed total order (alphabetical, i.e. dictionary, order). Second, it inserts the sorted data into a tree that resembles a summary data structure. Third, it removes non-frequent items from both the tree and the DSFI-list, and deletes the transaction items with the largest window attenuation support counts. Finally, it mines the maximal frequent itemsets of the data stream with a two-way search strategy that combines top-down and bottom-up search. Case analysis and experiments show that DSMMFI-DS executes more efficiently than DSM-MFI.
Authors Hu Jian (胡健), Wu Maomao (吴毛毛)
Source Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2014, No. 5, pp. 963-970 (8 pages)
Keywords data mining; data stream; landmark window; maximal frequent itemsets; window attenuation support count
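To make the notion of a maximal frequent itemset concrete, the following is a minimal brute-force sketch over a single fixed window of transactions. It is not the DSMMFI-DS algorithm: it has no DSFI-list, no summary tree, no window attenuation, and it searches top-down only rather than in both directions. The function name `maximal_frequent_itemsets` and the parameter `min_support` are hypothetical names chosen for illustration. It only mirrors the first and third steps loosely (sorting items in dictionary order and pruning non-frequent items) before enumerating candidates.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force illustration of maximal frequent itemsets.

    Assumption: `min_support` is an absolute count, and all transactions
    belong to one window. This is exponential in the number of frequent
    items and is meant only for tiny examples.
    """
    # Put the items of every transaction in dictionary order.
    ordered = [tuple(sorted(set(t))) for t in transactions]

    # Count the support of each single item.
    counts = {}
    for t in ordered:
        for item in t:
            counts[item] = counts.get(item, 0) + 1

    # Prune non-frequent items from the item list and the transactions.
    frequent_items = sorted(i for i, c in counts.items() if c >= min_support)
    pruned = [frozenset(i for i in t if counts[i] >= min_support)
              for t in ordered]

    def support(itemset):
        # Number of transactions that contain every item of `itemset`.
        return sum(1 for t in pruned if itemset <= t)

    # Top-down search: try the largest candidates first. A frequent
    # candidate that is not a subset of an already found itemset is maximal,
    # because every larger frequent itemset has been found by then.
    maximal = []
    for size in range(len(frequent_items), 0, -1):
        for cand in combinations(frequent_items, size):
            s = frozenset(cand)
            if any(s <= m for m in maximal):
                continue
            if support(s) >= min_support:
                maximal.append(s)

    return sorted(tuple(sorted(m)) for m in maximal)


if __name__ == "__main__":
    window = [["a", "b", "c"], ["a", "b"], ["b", "c", "d"],
              ["a", "b", "c"], ["e", "b"]]
    print(maximal_frequent_itemsets(window, 3))
```

With a support threshold of 3, items `d` and `e` are pruned, `{a, b, c}` appears in only two transactions, and the maximal frequent itemsets are `{a, b}` and `{b, c}`. Lowering the threshold to 2 makes `{a, b, c}` frequent, so it becomes the only maximal itemset.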
