UCM-PPM: A Multi-Parameter Web Prediction Model Based on User Classification
Abstract As the Web has grown rapidly over the past few decades, low latency and fast response have become increasingly important. A common answer is prefetching: documents a user is likely to request next are fetched into a proxy server cache in advance, which helps avoid network congestion and improves access efficiency. An effective prediction model is therefore essential to prefetching. For accuracy and practicality, we adopt the Prediction by Partial Match (PPM) suffix tree as the underlying model for predicting web pages. We identify two weaknesses in current cache-prefetching work: it pays little attention to differences among users, and it relies on a single metric. To address them, we present a multi-parameter web prediction model based on a user hierarchy that adjusts itself dynamically as Web requests arrive. First, we propose a user classification model built from the historical access log. By analyzing how the distribution of user accesses on the proxy server changes, the model divides users into several levels of importance according to the distribution of their contribution degree, and users with different contribution degrees receive different weights. For users with very low contribution, we align their access sequences and cluster their sessions by sequence similarity. Second, we construct a multi-parameter objective function for each node of the prediction tree. The function combines elements related to cache replacement strategies, such as page access heat, with the accumulated user classification weight based on access frequency, and the node with the maximum value is regarded as having the strongest predictive ability. We also establish an adjustment mechanism that operates while the prediction tree is in use, so the model can learn continuously and adjust dynamically. Finally, we compare our model with several existing models experimentally. Our model achieves better prediction accuracy and cache hit ratio, and its results can be improved further by adjusting the model parameters.
Source Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2018, No. 1, pp. 85-96 (12 pages)
Funding National Natural Science Foundation of China (61472070, 61672142)
Keywords Web prefetching; cache; user differentiation; multi-parameter; self-adaptation; PPM (Prediction by Partial Match)
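
The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the paper's UCM-PPM model: the abstract does not give the objective function, so the sketch assumes the simplest possible score, a transition count weighted by the class of the user who produced the session. The names `WeightedPPM`, `max_order`, and `user_weight` are hypothetical, and page access heat and the adaptive adjustment mechanism are left out.

```python
from collections import defaultdict


class WeightedPPM:
    """PPM-style next-page predictor with user-weighted counts (illustrative only)."""

    def __init__(self, max_order=2):
        # Longest context (number of preceding pages) that is remembered.
        self.max_order = max_order
        # context (tuple of pages) -> next page -> accumulated user weight
        self.counts = defaultdict(lambda: defaultdict(float))

    def train(self, session, user_weight=1.0):
        """Add one session; user_weight is an assumed stand-in for the user's class weight."""
        for i, page in enumerate(session):
            for k in range(1, self.max_order + 1):
                if i - k < 0:
                    break
                context = tuple(session[i - k:i])
                self.counts[context][page] += user_weight

    def predict(self, recent):
        """Return the highest-scoring next page, or None if no context matches."""
        # Partial matching: try the longest context first, then shorter ones.
        for k in range(min(self.max_order, len(recent)), 0, -1):
            followers = self.counts.get(tuple(recent[-k:]))
            if followers:
                return max(followers, key=followers.get)
        return None


model = WeightedPPM(max_order=2)
model.train(["A", "B", "C"], user_weight=3.0)  # session from a high-contribution user
model.train(["A", "B", "D"], user_weight=1.0)  # sessions from low-contribution users
model.train(["X", "B", "D"], user_weight=1.0)
print(model.predict(["A", "B"]))  # C
```

Here the transition to `C` outweighs the two observations of `D` because the session containing it carries a larger assumed weight, which is the effect user classification is meant to have on prediction.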
