摘要
给出了一种分布式Web日志挖掘模型DWLMS.根据对挖掘过程及算法进行分析,提出了一种基于DWLMS的局部频繁路径的更新算法LFP和全局频繁路径的更新算法GFP,较好地解决了Web访问信息的异地存储、实时增长、分布式算法通讯量等因素给模式分析过程带来的困难.在实验室对该方法进行了简单实现和实际日志数据的测试,结果表明了算法的有效性.
A distributed Web log mining system model (DWLMS) is presented. Based on the analysis on the procedure and algorithm of Web frequent access pattern mining, the more general incremental updating algorithms of local frequent paths (LFP) and global frequent paths (GFP) in a distributed database system based on DWLMS are proposed for discovering the frequent access paths. Some troubles produced by real time incremental distributed Web access information and more communication data are solved better by the algorithms. The method was realized simply and tested with real world Web log information in laboratory, and the results show that the algorithms are valid.
出处
《北京科技大学学报》
EI
CAS
CSCD
北大核心
2006年第9期896-901,共6页
Journal of University of Science and Technology Beijing
基金
国家自然基金资助项目(No.70431002)
北京电子科技学院重点实验室资助项目