Abstract
Existing simulation algorithms generate Web logs from static distribution models, so the generated logs differ considerably from real data. To address this problem, a Web Log Simulation Generation (WLSG) algorithm based on user interest migration is proposed. First, the relationship between Web logs and time is modeled. Second, the migration of user interest as users access files at different times is simulated. Finally, users' adaptive access to the file they are most interested in at the current moment is simulated. Compared with existing simulation algorithms that use static distribution models, the proposed algorithm improves the self-similarity metric by about 2.86% on average. Experimental results show that, by using interest migration to change users' access sequences, the algorithm simulates real Web logs well and can be applied effectively to Web log simulation generation.
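The abstract does not give the details of WLSG, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a single interest vector over files, a Zipf-like initial popularity, a random transfer of interest at each time step as the "migration", and sampling in proportion to current interest as the "adaptive access". The function name `generate_log` and the parameters `drift` and `seed` are hypothetical.

```python
import random


def generate_log(num_files, num_requests, drift=0.05, seed=0):
    """Generate (time, file_id) records whose popularity drifts over time.

    Illustrative only: the migration rule and the sampling rule below are
    assumptions, not the model described in the paper.
    """
    rng = random.Random(seed)
    # Assumed starting point: Zipf-like interest weights over file ids.
    interest = [1.0 / (rank + 1) for rank in range(num_files)]
    log = []
    for t in range(num_requests):
        # Interest migration (assumed rule): shift a fraction of one
        # file's interest to another randomly chosen file.
        src = rng.randrange(num_files)
        dst = rng.randrange(num_files)
        moved = drift * interest[src]
        interest[src] -= moved
        interest[dst] += moved
        # Adaptive access (assumed rule): pick a file with probability
        # proportional to its interest at the current time step.
        file_id = rng.choices(range(num_files), weights=interest, k=1)[0]
        log.append((t, file_id))
    return log
```

Setting `drift=0` in this sketch reduces it to a static distribution model, which is the baseline the abstract contrasts against.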
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2016, No. 12, pp. 3476-3480, 3504 (6 pages)
Funding
Fujian Provincial University-Industry Cooperation Project (2016H6007)
Keywords
interest migration
time series
log analysis
self-similarity
simulation generation