Abstract
To address the shortcomings of classical algorithms for user access prediction based on Web log mining, a prediction algorithm based on Markov chains and association rules (MAPA) is proposed. The algorithm uses a second-order Markov chain to find the pages that a user may visit in the next step or later, which yields the candidate prediction set. Two-item association rules are then used to correct the Markov prediction from both the forward and the reverse directions, producing the final predicted page. The algorithm thus combines the advantages of the Markov chain and association rules. By introducing a user feedback mechanism, a Markov prediction algorithm with feedback (MPAF) is also proposed. MPAF builds a history prediction tree (HPT) step by step during prediction, stores historical prediction information in the HPT, and judges whether each prediction was correct according to the user's feedback. During prediction, the second-order Markov prediction algorithm first generates the candidate prediction set, and the historical prediction information is then used to adjust the prediction algorithm dynamically and produce the predicted page. Theoretical analysis shows that both algorithms predict in linear time. Experimental results show that MAPA and MPAF improve prediction accuracy by 5% and 10% on average, respectively.
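The MAPA pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the session data layout, the per-session support counting, the reading of "forward" as conf(current -> candidate) and "reverse" as conf(candidate -> current), and the scoring formula that combines them are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def train_markov2(sessions):
    """Count second-order transitions: (page1, page2) -> Counter of next pages."""
    trans = defaultdict(Counter)
    for s in sessions:
        for a, b, c in zip(s, s[1:], s[2:]):
            trans[(a, b)][c] += 1
    return trans


def train_pair_rules(sessions):
    """Count, per session, how often each page and each unordered page pair occurs."""
    page_count = Counter()
    pair_count = Counter()
    for s in sessions:
        pages = set(s)
        page_count.update(pages)
        pair_count.update(frozenset(p) for p in combinations(sorted(pages), 2))
    return page_count, pair_count


def confidence(x, y, page_count, pair_count):
    """Two-item rule confidence: conf(x -> y) = support({x, y}) / support({x})."""
    if x == y or page_count[x] == 0:
        return 0.0
    return pair_count[frozenset((x, y))] / page_count[x]


def predict(prev2, trans, page_count, pair_count, k=3):
    """Pick one page from the top-k Markov candidates after rule-based correction."""
    candidates = trans.get(prev2)
    if not candidates:
        return None  # context never seen in the training sessions
    total = sum(candidates.values())
    current = prev2[1]
    best, best_score = None, -1.0
    for page, n in candidates.most_common(k):
        markov_p = n / total
        fwd = confidence(current, page, page_count, pair_count)  # assumed "forward" rule
        rev = confidence(page, current, page_count, pair_count)  # assumed "reverse" rule
        score = markov_p * (1.0 + fwd + rev)  # hypothetical combination, not from the paper
        if score > best_score:
            best, best_score = page, score
    return best


if __name__ == "__main__":
    # Toy sessions; page names are placeholders.
    sessions = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"], ["B", "D"]]
    trans = train_markov2(sessions)
    page_count, pair_count = train_pair_rules(sessions)
    print(predict(("A", "B"), trans, page_count, pair_count))
```

With the counts built in advance, each prediction only looks up one context and scores at most `k` candidates, which is in keeping with the efficiency claim in the abstract, although the paper's own complexity analysis may be framed differently.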
Source
《西安交通大学学报》
EI
CAS
CSCD
Peking University Core Journals
2010, No. 4, pp. 28-33 (6 pages)
Journal of Xi'an Jiaotong University
Funding
National Natural Science Foundation of China (Grant No. 50604012)