
Application of Improved PrefixSpan Algorithm in Web Mining (改进的PrefixSpan算法在Web挖掘中的应用) Cited by: 2
Abstract: To address the shortcomings of the PrefixSpan algorithm, the algorithm is improved by modifying the prefix strategy and discarding non-frequent items. These changes reduce the frequent swapping between main memory and external storage, shrink the projected databases generated during mining, and lower the time and space cost of building and scanning those projected databases. Experimental results show that, for long sequential pattern mining, the improved algorithm runs more than 35% more efficiently than the original, which makes it better suited to Web mining.
Affiliation: Jinan University (暨南大学)
Source: Science Technology and Engineering (《科学技术与工程》), 2009, No. 23, pp. 7176-7179 (4 pages)
Funding: Supported by the Natural Science Foundation of Guangdong Province (5006102)
Keywords: Web mining; PrefixSpan algorithm; sequence pattern
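
The abstract mentions two ideas: growing patterns from prefixes over projected databases, and discarding non-frequent items so those projections stay small. The following is a minimal sketch of a textbook-style PrefixSpan for sequences of single items (for example, page visits in a Web session). It is not the authors' improved algorithm, whose details are not given in this record. The function name `prefixspan`, the `(sequence index, offset)` pseudo-projection, and the up-front pruning step are assumptions made for illustration.

```python
from collections import Counter


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for sequences of single items.

    Illustrative sketch only; names and pruning strategy are assumptions.
    """
    # Count each item at most once per sequence.
    counts = Counter(item for seq in sequences for item in set(seq))
    frequent = {item for item, c in counts.items() if c >= min_support}
    # Assumed pruning step: drop non-frequent items before any projection,
    # so every projected database scans shorter sequences.
    db = [[item for item in seq if item in frequent] for seq in sequences]
    patterns = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs instead of
        # copied suffixes (pseudo-projection), which avoids rebuilding data.
        support = Counter()
        for sid, start in projected:
            for item in set(db[sid][start:]):
                support[item] += 1
        for item, count in sorted(support.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            patterns[pattern] = count
            next_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)
                except ValueError:
                    continue  # this sequence does not contain the item
                next_projected.append((sid, pos + 1))
            mine(pattern, next_projected)

    mine((), [(sid, 0) for sid in range(len(db))])
    return patterns


# Hypothetical Web sessions: each list is the ordered pages of one visit.
sessions = [
    ["home", "news", "sport"],
    ["home", "sport"],
    ["home", "news", "about"],
    ["news", "sport"],
]
patterns = prefixspan(sessions, min_support=2)
```

With these hypothetical sessions and `min_support=2`, the page `about` is dropped before mining because it appears in only one session, so it never enters a projected database.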

References (5)

  • 1. Kaushik A. 精通Web Analytics. Beijing: Tsinghua University Press, 2008.
  • 2. Han Jiawei, Kamber M. Data Mining: Concepts and Techniques. Beijing: China Machine Press, 2007.
  • 3. Pei Jian, Han Jiawei, Mortazavi-Asl B, et al. PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. Proc. of the 2001 Int'l Conf. on Data Engineering. IEEE Press, 2001: 215-224.
  • 4. Ding Bolin, Lo David, Han Jiawei, et al. Efficient mining of closed repetitive gapped subsequences from a sequence database. Department of Computer Science, University of Illinois at Urbana-Champaign, 2009.
  • 5. Pei J, Han J, Wang W. Mining sequential patterns with constraints in large databases. Proc. of the 11th Int. Conf. on Information and Knowledge Management, McLean, Virginia, 2002. New York: ACM Press: 18-25.

