Abstract
To mine usage patterns that more faithfully reflect Web users' interests, this paper takes two factors into account: the time users spend on each page and the number of times each page is clicked. It defines the frequent and preferred path and proposes an algorithm for mining such paths. Starting from the maximum forward paths, the method iteratively generates longer candidate frequent and preferred paths, and decides whether each candidate qualifies by computing its frequent and preferred support. Experiments on real Web log data show that the algorithm achieves relatively high coverage and accuracy.
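The abstract names maximum forward paths as the starting point but does not define them. As a minimal sketch, assuming the commonly used notion (a forward path ends as soon as the user navigates back to a page already on the current path), the extraction step might look like the following. The function name `maximal_forward_paths` and the single-letter page IDs are illustrative assumptions, not taken from the paper, and the paper's frequent and preferred support computation is not reproduced here because the abstract does not state it.

```python
def maximal_forward_paths(session):
    """Split one user session (an ordered list of page IDs) into
    maximal forward paths.

    Assumption: a backward reference is a visit to a page that is
    already on the current path; it closes the current forward path.
    """
    paths = []
    current = []            # the path currently being walked
    moving_forward = True   # True if the previous step was a forward move

    for page in session:
        if page in current:
            # Backward reference: emit the path only if we were moving
            # forward, so consecutive back-steps emit nothing extra.
            if moving_forward and current:
                paths.append(list(current))
            # Truncate the path back to the revisited page.
            current = current[: current.index(page) + 1]
            moving_forward = False
        else:
            current.append(page)
            moving_forward = True

    # The session ended during a forward move: emit the last path.
    if moving_forward and current:
        paths.append(list(current))
    return paths


if __name__ == "__main__":
    # Hypothetical session: each letter stands for one page view.
    for path in maximal_forward_paths(list("ABCDCBEGHGWAOUOV")):
        print("".join(path))
```

In a full pipeline, the paths returned here would serve as the initial candidates that the iterative step extends and then filters by support.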
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal (北大核心)
2009, Issue 24, pp. 5615-5617, 5621 (4 pages)
Keywords
data mining
user session
frequent and preferred support
maximum forward path
frequent and preferred path