摘要
日志结构合并树(LSM-tree)因能利用外存设备的顺序访问性能,被广泛应用于键值存储系统的核心数据结构.由于LSM-tree层次化的、有序的数据组织结构需要通过大量的数据合并操作维护,故引起了严重的写放大效应.最近的研究工作提出了若干优化方案缓解LSM-tree的写放大,但是牺牲了查询性能和空间利用率.为此基于LSM-tree的键值存储系统提出一种新的架构,其核心设计是采用键值分离的方案降低数据合并开销,并以一种新型的树状结构vTree为值维护一定程度有序性,保障高效的范围查询;同时为vTree设计了相应的数据合并和空间回收方法.实验结果表明,基于该架构实现的键值存储系统在写入、点查询、范围查询各方面有均衡的高性能表现,且空间开销较低.
Log-structured merge tree(LSM-tree)is widely used as the core data structure of modern key-value stores due to its ability to utilize sequential access performance of external storage.However,it suffers from expensive merge operations to maintain the layered and ordered data organization,which induces significant write amplification.Recent researches have proposed various optimizations to mitigate write amplification,but at the cost of query performance and space utilization.A new architecture for LSM-tree based key-value store is proposed.The key idea is to leverage the key-value separation design to mitigate merge overhead,while maintaining a certain degree of order for values by using a new tree structure called vTree to improve the range query performance.In the meantime,corresponding data merge and space reclaim algorithms were developed for the vTree.The experimental results show that the key-value store developed with this architecture has well balanced performance on write,point lookup,and range query,with low space cost.
作者
吴加禹
李永坤
许胤龙
WU Jiayu;LI Yongkun;XU Yinlong(School of Computer Science and Technology,University of Science and Technology of China,Hefei 230026, China)
基金
国家自然科学基金(61772484)
统筹推进世界一流大学和一流学科建设专项资金(YD2150002003)资助.
关键词
日志结构合并树
写放大
键值分离
范围查询
log-structured merge tree
write amplification
key-value separation
range query