Abstract
Motivated by the long-tail distribution of data access in cloud storage, this paper designs MSU (most similar unit), a caching strategy based on file correlation. MSU uses the correlation between files to carry out both prefetching and replacement in a large-capacity cache. First, MSU takes several access features of each file as the input to a cosine distance computation, which yields a measure of file correlation. MSU then maintains two candidate sets: a replacement set consisting of the files currently in the cache, and a prefetching set consisting of the files evicted from the cache within a recent time period. On a cache miss, the k-non-nearest neighbor of the missing file in the replacement set is taken as the replacement file, and the 1-nearest neighbor of the missing file in the prefetching set is taken as the prefetching file. Simulation experiments show that MSU outperforms LRU (least recently used), ARC (adaptive replacement cache), and GDS (greedy dual-size) in both hit rate and byte hit rate.
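The miss-handling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The feature vectors, the function names, the capacity counted in files rather than bytes, and the reading of "k-non-nearest neighbor" as "the k files least similar to the missing file" (with k set to the number of slots that must be freed) are all assumptions made for illustration. The sketch ranks by cosine similarity, so the nearest neighbor is the file with the highest similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as unrelated
    return dot / (norm_a * norm_b)


def k_non_nearest(target, candidates, features, k):
    """Return the k candidates LEAST similar to target.

    Assumed reading of 'k-non-nearest neighbor'; ties are broken by file id.
    """
    ranked = sorted(
        candidates,
        key=lambda f: (cosine_similarity(features[target], features[f]), f),
    )
    return ranked[:k]


def nearest(target, candidates, features):
    """Return the single candidate MOST similar to target, or None if empty."""
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda f: cosine_similarity(features[target], features[f]),
    )


def on_miss(missing, cache, evicted, features, capacity):
    """Handle one cache miss, mutating the sets `cache` and `evicted`.

    cache    -- replacement candidate set (files currently cached)
    evicted  -- prefetching candidate set (files recently evicted)
    capacity -- hypothetical capacity, counted in files for simplicity
    Returns (victims, prefetched).
    """
    # Prefetch the recently evicted file most similar to the missing one.
    prefetched = nearest(missing, evicted - {missing}, features)
    incoming = {missing} | ({prefetched} if prefetched is not None else set())

    # Evict the cached files least similar to the missing file.
    need = max(0, len(cache) + len(incoming) - capacity)
    victims = k_non_nearest(missing, cache, features, need)

    cache.difference_update(victims)
    evicted.update(victims)
    evicted.difference_update(incoming)
    cache.update(incoming)
    return victims, prefetched


# Hypothetical two-dimensional access features for four files.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1], "d": [1.0, 0.1]}
cache, evicted = {"a", "b"}, {"c"}
victims, prefetched = on_miss("d", cache, evicted, features, capacity=3)
print(victims, prefetched, sorted(cache), sorted(evicted))
```

In this toy run, file "d" misses, "c" is prefetched because it is the evicted file most similar to "d", and "b" is replaced because it is the cached file least similar to "d". A byte-aware variant would track file sizes instead of a file count.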
Authors
XIAO Fang (肖芳); ZHOU Ke (周可)
(School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; Library, Huazhong University of Science and Technology, Wuhan 430074, China)
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2019, No. 4, pp. 1-6 (6 pages)
Keywords
cloud storage
caching strategy
hit rate
file correlation
file prefetching