Abstract
Traditional LRU-based disk caches cannot recognize duplicate data blocks with identical content, so redundant data accumulates in the cache. They also use a fixed page size, although page size is an important factor in the cache hit ratio and an optimal page size can maximize it. This paper proposes a disk cache deduplication strategy based on mixed pages. A mixed-page mechanism is introduced into the disk cache: base pages are retained, huge pages are added, and the huge page size is adjusted adaptively to maximize the hit ratio. The hotness of base pages and huge pages is also monitored. Cold huge pages with a high duplication rate are split into base pages, and split base pages that become hot are reconstructed into huge pages, which gives dynamic conversion between base pages and huge pages. Deduplication is applied to base pages and huge pages separately, so that the hit ratio is maximized while the deduplication rate is maintained. Simulation experiments on real trace data show that, compared with a traditional disk cache, the strategy significantly improves the disk cache hit ratio, by up to 30.08x, and reduces disk access time by up to 31.72%.
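The deduplication idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it models only base pages, uses SHA-1 fingerprints as an assumed way to detect identical content, and leaves out the adaptive huge page size and the hot/cold page conversion. The class name `DedupLRUCache` and the `read_from_disk` callback are hypothetical names chosen for this example.

```python
import hashlib
from collections import OrderedDict


class DedupLRUCache:
    """LRU cache whose capacity is counted in unique page contents.

    Hypothetical sketch: addresses with identical content share one
    stored copy, so more addresses fit than in a plain LRU cache.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity      # max number of unique page contents
        self.index = OrderedDict()    # block address -> fingerprint, LRU order
        self.store = {}               # fingerprint -> [data, reference count]
        self.hits = 0
        self.misses = 0

    def _release(self, fingerprint):
        """Drop one reference; free the stored copy when none remain."""
        entry = self.store[fingerprint]
        entry[1] -= 1
        if entry[1] == 0:
            del self.store[fingerprint]

    def access(self, addr, read_from_disk):
        """Return the page at addr, loading it through read_from_disk on a miss."""
        if addr in self.index:
            self.hits += 1
            self.index.move_to_end(addr)  # mark as most recently used
            return self.store[self.index[addr]][0]

        self.misses += 1
        data = read_from_disk(addr)
        fingerprint = hashlib.sha1(data).hexdigest()
        if fingerprint in self.store:
            # Duplicate content: share the existing copy instead of storing it again.
            self.store[fingerprint][1] += 1
        else:
            # Evict least recently used addresses until a unique slot is free.
            while len(self.store) >= self.capacity:
                _, old_fingerprint = self.index.popitem(last=False)
                self._release(old_fingerprint)
            self.store[fingerprint] = [data, 1]
        self.index[addr] = fingerprint
        return data
```

With a capacity of two unique pages, a cache like this can keep three addresses resident when two of them hold the same content, whereas a plain LRU cache of the same size would keep only two. That extra residency is the source of the hit ratio gain the abstract attributes to deduplication.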
Authors
斯雷
邓玉辉
SI Lei; DENG Yu-hui (Department of Computer Science, Jinan University, Guangzhou 510632, China; State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2019, No. 9, pp. 2000-2006 (7 pages)
Journal of Chinese Computer Systems
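Funding
Supported by the National Natural Science Foundation of China (61572232)
Supported by the Guangzhou Science and Technology Program (201802010028, 201802010058)
Supported by the Open Fund of the State Key Laboratory of Computer Architecture, Chinese Academy of Sciences (CARCH201705)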
Keywords
disk cache
mixed pages
base page
huge page
data deduplication