Abstract
To reduce the frequent storage-device accesses caused by duplicate-data index lookups in storage systems, this paper studies data deduplication, analyzes how Bloom Filters are currently used in it and the storage-access performance problems that remain, and proposes an efficient Bloom Filter-based deduplication optimization scheme. To address the false positives inherent in a single Bloom Filter, an auxiliary Bloom Filter is added, which lowers the false positive rate and thereby reduces the number of storage-device accesses. To address Bloom Filter false negatives caused by system software errors, a single-check-bit error-checking mechanism is introduced, which prevents false-negative values from being stored while also reducing memory overhead. Simulation results show that the improved method balances the Bloom Filter false positive rate against storage-device access overhead. Combining a judgment mechanism with the auxiliary Bloom Filter and the single-check-bit mechanism lowers the false positive rate and reduces storage-device access overhead.
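The abstract does not spell out how the two filters and the check bit interact, so the following Python sketch is only one plausible reading, not the paper's actual design. It assumes that a chunk fingerprint is treated as a possible duplicate only when both the primary and the auxiliary filter report it, and that one parity bit per filter is checked before a negative answer is trusted. The names `BloomFilter`, `DedupIndex`, `parity_ok`, and `disk_lookups` are hypothetical, and a Python `set` stands in for the on-disk fingerprint index.

```python
import hashlib


class BloomFilter:
    """Bloom filter with a single parity bit over the whole bit array (assumed layout)."""

    def __init__(self, num_bits, num_hashes, salt):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.salt = salt                  # makes the two filters hash independently
        self.bits = bytearray(num_bits)   # one byte per bit, for readability
        self.parity = 0                   # parity of the number of set bits

    def _positions(self, item):
        # Double hashing: derive k positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(self.salt + item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            if not self.bits[pos]:
                self.bits[pos] = 1
                self.parity ^= 1          # keep the parity bit in sync

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

    def parity_ok(self):
        # O(num_bits) recomputation; acceptable for a sketch only.
        return sum(self.bits) % 2 == self.parity


class DedupIndex:
    """Primary + auxiliary filter in front of a (simulated) on-disk index."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.primary = BloomFilter(num_bits, num_hashes, b"primary")
        self.auxiliary = BloomFilter(num_bits, num_hashes, b"auxiliary")
        self.on_disk = set()              # stand-in for the on-disk fingerprint index
        self.disk_lookups = 0

    def store(self, chunk):
        """Return True if the chunk was new and stored, False if it was a duplicate."""
        fp = hashlib.sha1(chunk).digest()
        trusted = self.primary.parity_ok() and self.auxiliary.parity_ok()
        maybe = self.primary.might_contain(fp) and self.auxiliary.might_contain(fp)
        if trusted and not maybe:
            is_dup = False                # definitely new: skip the disk lookup
        else:
            self.disk_lookups += 1        # possible duplicate, or filters not trusted
            is_dup = fp in self.on_disk
        if not is_dup:
            self.on_disk.add(fp)
            self.primary.add(fp)
            self.auxiliary.add(fp)
        return not is_dup
```

In this sketch a new chunk costs no disk lookup while both filters are intact, and a cleared bit that would otherwise produce a false negative is caught by the parity mismatch and resolved against the on-disk index. A single parity bit only detects an odd number of flipped bits, so this is a cheap check rather than full protection.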
Source
《计算机技术与发展》
2016, No. 8, pp. 182-186, 190 (6 pages)
Computer Technology and Development
Funding
Supported by the National Natural Science Foundation of China (11501302)