Abstract
Mining maximal frequent page sets is one of the key applications of Web usage mining. Because the sessions recorded over a given period carry users' access patterns and motivations, a frequent page tree with dwell time (FPDT-tree) is designed: an FP-tree-like structure whose nodes carry dwell time. The FPDT-tree stores and compresses a session database constrained by bi-directional dwell time, which simplifies the setting of dwell time thresholds during mining. Based on the FPDT-tree, an algorithm called Maximum Frequent Page Set Mining (MFPSM) is presented, which quickly traverses the tree to discover the maximal frequent page sets in the sessions. Experimental results show that, in a time-constrained environment, MFPSM can effectively shorten the time needed to mine maximal frequent page sets, provided the decision-maker supplies appropriate dwell time constraint thresholds.
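The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's FPDT-tree or MFPSM implementation, whose details the abstract does not give. It assumes that the "bi-directional" constraint means a lower and an upper bound on dwell time, that each tree node accumulates a support count and a total dwell time, and that sessions are lists of (page, dwell time in seconds) pairs. The names `Node`, `filter_session`, `build_tree`, and `maximal_frequent_sets` are hypothetical, and the maximal sets are found by brute-force enumeration rather than by tree traversal.

```python
from collections import Counter
from itertools import combinations


class Node:
    """Tree node holding a page, a support count, and total dwell time."""

    def __init__(self, page, parent=None):
        self.page = page
        self.parent = parent
        self.count = 0
        self.dwell = 0.0
        self.children = {}


def filter_session(session, min_dwell, max_dwell):
    """Keep page views whose dwell time lies in [min_dwell, max_dwell]."""
    return [(p, t) for p, t in session if min_dwell <= t <= max_dwell]


def build_tree(sessions, min_dwell, max_dwell, min_support):
    """Build an FP-tree-like prefix tree over dwell-time-filtered sessions."""
    filtered = [filter_session(s, min_dwell, max_dwell) for s in sessions]
    # Support of a page = number of filtered sessions containing it.
    support = Counter(p for s in filtered for p in {p for p, _ in s})
    root = Node(None)
    for s in filtered:
        dwell = Counter()
        for p, t in s:
            dwell[p] += t  # total dwell time of the page in this session
        pages = [p for p in dwell if support[p] >= min_support]
        # Order by descending support (ties by name), as in an FP-tree.
        pages.sort(key=lambda p: (-support[p], p))
        node = root
        for p in pages:
            node = node.children.setdefault(p, Node(p, node))
            node.count += 1
            node.dwell += dwell[p]
    return root, filtered


def maximal_frequent_sets(filtered, min_support):
    """Brute-force maximal frequent page sets (illustration only)."""
    page_sets = [frozenset(p for p, _ in s) for s in filtered]
    pages = sorted(set().union(*page_sets))
    frequent = []
    for k in range(1, len(pages) + 1):
        for combo in combinations(pages, k):
            c = frozenset(combo)
            if sum(c <= ps for ps in page_sets) >= min_support:
                frequent.append(c)
    # A frequent set is maximal if no frequent proper superset exists.
    return [f for f in frequent if not any(f < g for g in frequent)]


# Hypothetical sessions: (page, dwell time in seconds).
sessions = [
    [("A", 30), ("B", 45), ("C", 2)],
    [("A", 20), ("B", 60)],
    [("A", 15), ("C", 40), ("D", 900)],
    [("B", 25), ("C", 35)],
]
root, filtered = build_tree(sessions, min_dwell=5, max_dwell=600, min_support=2)
result = maximal_frequent_sets(filtered, min_support=2)
print([sorted(s) for s in result])  # [['C'], ['A', 'B']]
```

In this toy data the very short view of C (2 s) and the very long view of D (900 s) are discarded before the tree is built, which is one way a dwell time constraint can shrink the search space.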
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2007, No. 4, pp. 190-193 (4 pages)
Funding
Supported by the Doctoral Foundation of the Hebei Provincial Department of Education (B200322).