Application of Improved PrefixSpan Algorithm in Web Mining

Cited by: 2

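Abstract: To address the shortcomings of the PrefixSpan algorithm, the prefix strategy is modified and infrequent items are discarded. This reduces frequent swapping between main memory and external storage, shrinks the projected databases generated during mining, and lowers the time and space cost of constructing and scanning them. Experimental results show that, for long sequential pattern mining, the improved algorithm runs more than 35% more efficiently than the original, making it better suited to Web mining.

Generating frequent itemsets is a critical step in association rule mining. Through an analysis of the Apriori algorithm, a new algorithm for mining frequent itemsets based on set and bit operations is proposed. In this algorithm, a digital view is used to express the transactions that contain each item, and bit operations on the digital view are used to calculate the support count of each itemset. The new algorithm solves the problem of repeatedly scanning the database in the Apriori algorithm and improves operating efficiency.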
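The two ideas named in the abstract, discarding infrequent items and keeping projected databases small, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not describe their modified prefix strategy, so the sketch substitutes the well-known pseudo-projection technique (storing sequence index and offset pairs instead of copying suffixes). It also assumes each sequence is a list of single items such as visited pages. The names `prune_infrequent` and `prefixspan` are hypothetical.

```python
from collections import Counter


def prune_infrequent(db, min_support):
    """Remove items that appear in fewer than min_support sequences."""
    counts = Counter(item for seq in db for item in set(seq))
    frequent = {item for item, c in counts.items() if c >= min_support}
    return [[item for item in seq if item in frequent] for seq in db]


def prefixspan(db, min_support):
    """Return {pattern tuple: support} for sequences of single items.

    Assumption: pseudo-projection stands in for the paper's unspecified
    prefix strategy; it is a standard PrefixSpan optimisation.
    """
    db = prune_infrequent(db, min_support)
    results = {}

    def grow(prefix, projected):
        # projected is a list of (sequence index, start offset) pairs,
        # so a projected database is never materialised as copied suffixes.
        counts = Counter()
        for sid, start in projected:
            seq = db[sid]
            seen = {seq[i] for i in range(start, len(seq))}
            counts.update(seen)  # count each item once per sequence
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue  # infrequent extension: do not project on it
            pattern = prefix + (item,)
            results[pattern] = support
            next_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                next_projected.append((sid, pos + 1))
            grow(pattern, next_projected)

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


if __name__ == "__main__":
    # Hypothetical click sequences; letters stand for page identifiers.
    sessions = [["a", "b", "c"], ["a", "c"], ["a", "b", "c", "d"], ["b", "c"]]
    for pattern, support in prefixspan(sessions, min_support=3).items():
        print(pattern, support)
```

Dropping infrequent items up front is safe for subsequence patterns because no frequent pattern can contain an infrequent item, and the shorter sequences make every later projection cheaper to scan.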

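Authors: 谢清森, 杨天奇

Affiliation: Jinan University (暨南大学)

Source: Science Technology and Engineering (《科学技术与工程》), 2009, No. 23, pp. 7176-7179 (4 pages)

Funding: Supported by the Natural Science Foundation of Guangdong Province (5006102)

Keywords: Web mining; PrefixSpan algorithm; sequence pattern

Classification: TP391.3 [Automation and Computer Technology: Computer Application Technology]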

Citation network
Related literature

References (5)

1. Kaushik A. Mastering Web Analytics. Beijing: Tsinghua University Press, 2008.
2. Han Jiawei, Kamber M. Data Mining: Concepts and Techniques. Beijing: China Machine Press, 2007.
3. Pei Jian, Han Jiawei, Mortazavi-Asl B, et al. PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. Proc of 2001 Int'l Conf. on Data Engineering. IEEE Press, 2001: 215-224.
4. Ding Bolin, Lo David, Han Jiawei, et al. Efficient mining of closed repetitive gapped subsequences from a sequence database. Department of Computer Science, University of Illinois at Urbana-Champaign, 2009.
5. Pei J, Han J, Wang W. Mining sequential patterns with constraints in large databases. Proc of the 11th Int Conf on Information and Knowledge Management, McLean, Virginia, 2002. New York: ACM Press: 18-25.

Co-cited literature (20)

1. 车竞. A brief discussion of comparative sentences in Modern Chinese [J]. Journal of Hubei Normal University (Philosophy and Social Sciences), 2005, 25(3): 60-63. Cited by: 23
2. 张坤, 朱扬勇. A sequential pattern mining algorithm without repeated scanning of projected databases [J]. Journal of Computer Research and Development, 2007, 44(1): 126-132. Cited by: 17
3. AGRAWAL R, SRIKANT R. Mining sequential patterns [C]// ICDE '95: Proceedings of the Eleventh International Conference on Data Engineering. Washington, DC: IEEE Computer Society, 1995: 3-14.
4. SRIKANT R, AGRAWAL R. Mining sequential patterns: generalizations and performance improvements [C]// EDBT '96: Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology. Berlin: Springer-Verlag, 1996: 3-17.
5. ZAKI M. SPADE: an efficient algorithm for mining frequent sequences [J]. Machine Learning, 2001, 42(1): 31-60.
6. PEI J, HAN J, PINTO H, et al. PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth [C]// Proceedings 17th International Conference on Data Engineering. Washington, DC: IEEE Computer Society, 2001: 215-224.
7. HAN J, PEI J, MORTAZAVI-ASL B, et al. FreeSpan: frequent pattern-projected sequential pattern mining [C]// Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2000: 355-359.
8. HAN J, KAMBER M. Data Mining: Concepts and Techniques [M]. Translated by 范明, 孟小峰. Beijing: China Machine Press, 2001.
9. Ganapathibhotla Murthy, Liu Bing. Mining Opinions in Comparative Sentences [C]// Proceedings of the 22nd International Conference on Computational Linguistics, 2008: 241-248.
10. Jindal Nitin, Liu Bing. Identifying Comparative Sentences in Text Documents [C]// Proceedings of SIGIR 2006. Washington, USA, 2006: 244-251.

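Citing literature (2)

1. 公伟, 刘培玉, 贾娴. A sequential pattern mining algorithm based on improved PrefixSpan [J]. Journal of Computer Applications, 2011, 31(9): 2405-2407. Cited by: 12
2. 王素格, 王凤霞, 宋雅. A method for identifying Chinese comparative sentences based on sequential patterns [J]. Journal of Shanxi University (Natural Science Edition), 2013, 36(2): 172-179. Cited by: 1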

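Second-level citing literature (13)

1. 周晓凤, 肖南峰, 文翰. Research on speech emotion recognition based on emotional feature classification [J]. Application Research of Computers, 2012, 29(10): 3648-3650. Cited by: 5
2. 缪裕青, 吴孔玲, 朱晓雁, 张锦杏. A candidate-free closed sequential pattern mining algorithm based on a two-level index structure [J]. Application Research of Computers, 2012, 29(10): 3672-3676.
3. 李陶深, 王伟娜, 陈庆峰. Research on Web access sequential pattern mining algorithms [J]. Computer Science, 2013, 40(12): 41-44. Cited by: 2
4. 张巍, 刘峰, 滕少华. An improved PrefixSpan algorithm and its application in sequential pattern mining [J]. Journal of Guangdong University of Technology, 2013, 30(4): 49-54. Cited by: 11
5. 付沙. Analysis of library users' borrowing behavior based on sequential pattern mining [J]. Information Studies: Theory & Application, 2014, 37(6): 103-106. Cited by: 18
6. 陈勇. A data mining method for target behavior sequential patterns [J]. Radio Communications Technology, 2015, 41(2): 79-81. Cited by: 11
7. 李硕, 石丽红, 呼忠权, 孔涛. Application of sequential pattern mining technology in digital libraries [J]. Journal of Library and Information Sciences in Agriculture, 2015, 27(7): 40-43. Cited by: 2
8. 杨斐, 张万桢, 陆垂伟. A candidate-free closed sequential pattern mining algorithm [J]. Computer Applications and Software, 2016, 33(3): 279-283. Cited by: 1
9. 薛飞, 单征, 闫丽景, 范超. Multi-trajectory feature detection technology based on data mining [J]. Computer Science, 2016, 43(5): 91-95. Cited by: 2
10. 王斌, 黄晓芳, 袁平. An improved algorithm based on PrefixSpan sequential pattern mining [J]. Journal of Southwest University of Science and Technology, 2016, 31(4): 68-72. Cited by: 6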

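1. 公伟, 刘培玉, 贾娴. A sequential pattern mining algorithm based on improved PrefixSpan [J]. Journal of Computer Applications, 2011, 31(9): 2405-2407. Cited by: 12
2. 叶飞跃. A distributed frequent pattern mining algorithm based on adaptive hash chains [J]. Systems Engineering and Electronics, 2005, 27(3): 560-564. Cited by: 2
3. 胡圣荣. A quasi-in-place stable merge sort algorithm [J]. Journal of Hunan Institute of Science and Technology (Natural Sciences), 2014, 27(2): 45-49.
4. 谷雨, 郑锦辉, 戴明伟, 何磊. Research on intrusion detection based on a Bagging ensemble of support vector machines [J]. Microelectronics & Computer, 2005, 22(5): 17-19. Cited by: 6
5. 缪裕青, 吴孔玲, 朱晓雁, 苏杰. A sequential pattern mining algorithm based on the position of the last item in each sequence [J]. Application Research of Computers, 2012, 29(7): 2505-2508. Cited by: 5
6. 刘贞, 张小真. Research on data mining algorithms based on a stack model [J]. Journal of Southwest China Normal University (Natural Science Edition), 2002, 27(3): 312-315. Cited by: 2
7. 刘辉, 王伯雄, 李鹏程, 任怀艺. Coding design for binocular structured light with bidirectional scanning projection [J]. Chinese Journal of Scientific Instrument, 2012, 33(8): 1862-1867. Cited by: 4
8. 陈兆学, 施鹏飞. A fast license plate location and segmentation method based on grayscale images [J]. Computer Engineering, 2006, 32(9): 173-174. Cited by: 16
9. 汪林林, 范军. An improved sequential pattern mining algorithm based on PrefixSpan [J]. Computer Engineering, 2009, 35(23): 56-58. Cited by: 13
10. 肖仁财, 薛安荣. An effective method for mining multidimensional sequential patterns [J]. Computer Engineering and Applications, 2008, 44(6): 187-190. Cited by: 3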
