
WebLog Access Sequential Pattern Mining Based on SPAM-FPT
Abstract: WebLog access sequential pattern mining applies the sequential pattern techniques of data mining to Web server log files in order to improve Web information services. The main challenge is the high processing cost caused by the large volume of data. Combining ideas from SPAM and PrefixSpan, this paper proposes a new algorithm, SPAM-FPT. By building a First_Position_Table, SPAM-FPT avoids the "ANDing" and "joining" operations of SPAM as well as the construction of a large number of projected databases in PrefixSpan, and can quickly mine all frequent subsequences in the database.
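The abstract does not give the details of SPAM-FPT, so the following Python sketch is only one plausible reading of the idea, not the paper's algorithm. It grows patterns prefix by prefix, as PrefixSpan does, but instead of copying projected databases it keeps, for each sequence, the position where the earliest match of the current prefix ends, and looks up the first later position of each candidate page. The function names, the one-page-per-step sequence format, and the `min_support` parameter are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def build_position_index(sequences):
    """For each sequence, map every page to the sorted positions where it occurs."""
    index = []
    for seq in sequences:
        positions = defaultdict(list)
        for pos, page in enumerate(seq):
            positions[page].append(pos)
        index.append(dict(positions))
    return index


def first_position_after(positions, after):
    """Return the first position strictly greater than `after`, or None."""
    i = bisect_right(positions, after)
    return positions[i] if i < len(positions) else None


def mine_frequent_subsequences(sequences, min_support):
    """Return {pattern tuple: support} for all frequent subsequences.

    Hypothetical sketch: a pattern's support is the number of sequences
    that contain it as a (not necessarily contiguous) subsequence.
    """
    index = build_position_index(sequences)
    results = {}

    def grow(prefix, ends):
        # ends maps sequence id -> position where the earliest match of
        # `prefix` ends in that sequence (-1 for the empty prefix).
        candidates = defaultdict(dict)
        for sid, end in ends.items():
            for page, positions in index[sid].items():
                nxt = first_position_after(positions, end)
                if nxt is not None:
                    candidates[page][sid] = nxt
        for page in sorted(candidates):
            new_ends = candidates[page]
            if len(new_ends) >= min_support:
                pattern = prefix + (page,)
                results[pattern] = len(new_ends)
                grow(pattern, new_ends)

    grow((), {sid: -1 for sid in range(len(sequences))})
    return results


if __name__ == "__main__":
    # Made-up access sequences, one list of page names per user session.
    sessions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
    for pattern, support in mine_frequent_subsequences(sessions, 2).items():
        print(pattern, support)
```

Tracking only the earliest end position per sequence is enough here, because a pattern extended by a page is contained in a sequence exactly when that page occurs after the earliest match of the prefix. This keeps one integer per sequence for each prefix rather than a projected copy of the data.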
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2007, No. 17, pp. 80-82 (3 pages)
Keywords: sequential pattern mining; WebLog; frequent subsequence; SPAM-FPT
