随着内存密集型应用的快速发展,应用对单机内存容量的需求日益增大.然而,受到颗粒密度的限制,内存容量的扩展度较低.页交换机制是进行内存扩展的经典技术,该机制通过将较少使用的内存页面暂存在存储设备,以达到扩展内存的目的.过去页交...随着内存密集型应用的快速发展,应用对单机内存容量的需求日益增大.然而,受到颗粒密度的限制,内存容量的扩展度较低.页交换机制是进行内存扩展的经典技术,该机制通过将较少使用的内存页面暂存在存储设备,以达到扩展内存的目的.过去页交换机制由于慢速磁盘的读写速度限制,无法被广泛应用.近年来,得益于超低延迟固态硬盘(solid state drive,SSD)的快速发展,页交换机制可以利用其低延迟的读写特性,提升页交换效率.然而,在低I/O延迟的情况下,传统页交换机制的I/O栈存在巨大的软件开销.首先对使用超低延迟SSD的Linux页交换机制进行测试与分析,发现现有页交换机制的主要瓶颈在于发送请求时存在队头阻塞问题、I/O合并和调度开销,以及内核返回路径上的中断处理和直接内存回收开销.基于分析结果,提出基于超低延迟SSD的页交换机制Ultraswap.Ultraswap在Linux I/O栈的基础上增加对轮询请求的处理,并降低I/O合并与调度开销,实现轻量级的I/O栈.基于Ultraswap的I/O栈,对内核页交换机制的换入与换出路径进一步优化.通过优化对缺页、直接内存回收的处理,降低页交换机制关键路径上的时间开销.实验结果表明Ultraswap在应用测试场景下相比Linux页交换机制能够提升19%的平均性能;在可使用内存比例为20%的情况下,Ultraswap可达到33%的性能提升.展开更多
NoSQL系统因其高性能、高可扩展性的优势在大数据管理中得到广泛应用,而key-value(KV)模型则是NoSQL系统中使用最广泛的一种存储模型.KV型本地存储系统对于以机械磁盘为持久化存储的情形,存在许多性能优化技术,但这些优化技术面对当前...NoSQL系统因其高性能、高可扩展性的优势在大数据管理中得到广泛应用,而key-value(KV)模型则是NoSQL系统中使用最广泛的一种存储模型.KV型本地存储系统对于以机械磁盘为持久化存储的情形,存在许多性能优化技术,但这些优化技术面对当前的硬件发展新趋势,如多核处理器、大内存和低延迟闪存、非易失性内存NVM(Non-Volatile Memory)等,难以充分发挥新硬件的优势,如数据索引、并发控制、事务日志管理等技术在多核架构下存在多核扩展性问题,又如数据存储策略不适应闪存SSD(Solid State Drive)的新存储特性而产生了IO利用率低效的问题.针对多核处理器、大内存和闪存、NVM等硬件发展新趋势,文中面向当前的大数据应用背景,综述了KV型本地存储系统在索引技术、并发控制、事务日志管理和数据放置等核心模块上的最新优化技术和系统研究成果.从处理器、内存和持久化存储的角度概括了KV型本地存储系统当前存在的最优技术,总结了当前研究尚未解决的技术挑战,并对KV型本地存储系统在CPU缓存高效性、事务日志扩展性和高可用性等方面的研究进行了展望.展开更多
作为SSD(solid state drives)的存储元件,NAND闪存在进行写之前,存储单元必须先进行擦除,因此被称作写一次存储器。SSD的使用寿命受到存储单元的擦除次数的限制,因此减少擦除次数对于SSD的可靠性十分重要。提出了一种通过编码压缩后的...作为SSD(solid state drives)的存储元件,NAND闪存在进行写之前,存储单元必须先进行擦除,因此被称作写一次存储器。SSD的使用寿命受到存储单元的擦除次数的限制,因此减少擦除次数对于SSD的可靠性十分重要。提出了一种通过编码压缩后的差值信息的方法来对SSD中写过一次的页面进行二次写,从而减少SSD的擦除次数,延长使用寿命。首先计算物理页面中更新前后的数据的差值,然后将差值数据进行压缩,再将压缩后的数据进行编码后保存在写过的物理页中的可写位中,以此实现写过物理页的二次写。实验结果表明,对于数据更新为主的应用,该方法能够充分利用写过的物理页中的可写位,大幅减少SSD的擦除次数。展开更多
Non-volatile memories(NVMs)provide lower latency and higher bandwidth than block devices.Besides,NVMs are byte-addressable and provide persistence that can be used as memory-level storage devices(non-volatile main mem...Non-volatile memories(NVMs)provide lower latency and higher bandwidth than block devices.Besides,NVMs are byte-addressable and provide persistence that can be used as memory-level storage devices(non-volatile main memory,NVMM).These features change storage hierarchy and allow CPU to access persistent data using load/store instructions.Thus,we can directly build a file system on NVMM.However,traditional file systems are designed based on slow block devices.They use a deep and complex software stack to optimize file system performance.This design results in software overhead being the dominant factor affecting NVMM file systems.Besides,scalability,crash consistency,data protection,and cross-media storage should be reconsidered in NVMM file systems.We survey existing work on optimizing NVMM file systems.First,we analyze the problems when directly using traditional file systems on NVMM,including heavy software overhead,limited scalability,inappropriate consistency guarantee techniques,etc.Second,we summarize the technique of 30 typical NVMM file systems and analyze their advantages and disadvantages.Finally,we provide a few suggestions for designing a high-performance NVMM file system based on real hardware Optane DC persistent memory module.Specifically,we suggest applying various techniques to reduce software overheads,improving the scalability of virtual file system(VFS),adopting highly-concurrent data structures(e.g.,lock and index),using memory protection keys(MPK)for data protection,and carefully designing data placement/migration for cross-media file system.展开更多
Persistent memory(PM)promises byte-addressability,large capacity,and durability.Main memory systems,such as key-value stores and in-memory databases,benefit from such features of PM.Due to the great popularity of hash...Persistent memory(PM)promises byte-addressability,large capacity,and durability.Main memory systems,such as key-value stores and in-memory databases,benefit from such features of PM.Due to the great popularity of hash-ing index in main memory systems,a number of research efforts are made to provide high average performance persistent hashing.However,suboptimal tail performance in terms of tail throughput and tail latency is still observed for existing persistent hashing.In this paper,we analyze major sources of suboptimal tail performance from key design issues of persis-tent hashing.We identify the global hash structure and concurrency control as remaining explorable design spaces for im-proving tail performance.We propose Directory-sharing Multi-level Extendible Hashing(Dalea)for PM.Dalea designs an-cestor link-based extendible hashing as well as fine-grained transient lock to address the two main sources(rehashing and locking)affecting tail performance.The evaluation results show that,compared with state-of-the-art persistent hashing Dash,Dalea achieves increased tail throughput by 4.1x and reduced tail latency by 5.4x.Moreover,in order to provide de-sign guidelines for improving tail performance,we adopt Dalea as a testbed to identify different impacts of four factors on tail performance,including fine-grained rehashing,transient locking,memory pre-allocation,and fingerprinting.展开更多
As the scaling of applications increases, the demand of main memory capacity increases in order to serve large working set. It is difficult for DRAM (dynamic random access memory) based memory system to satisfy the ...As the scaling of applications increases, the demand of main memory capacity increases in order to serve large working set. It is difficult for DRAM (dynamic random access memory) based memory system to satisfy the memory capacity requirement due to its limited scalability and high energy consumption. Compared to DRAM, PCM (phase change memory) has better scalability, lower energy leakage, and non-volatility. PCM memory systems have become a hot topic of academic and industrial research. However, PCM technology has the following three drawbacks: long write latency, limited write endurance, and high write energy, which raises challenges to its adoption in practice. This paper surveys architectural research work to optimize PCM memory systems. First, this paper introduces the background of PCM. Then, it surveys research efforts on PCM memory systems in performance optimization, lifetime improving, and energy saving in detail, respectively. This paper also compares and summarizes these techniques from multiple dimensions. Finally, it concludes these optimization techniques and discusses possible research directions of PCM memory systems in future.展开更多
文摘随着内存密集型应用的快速发展,应用对单机内存容量的需求日益增大.然而,受到颗粒密度的限制,内存容量的扩展度较低.页交换机制是进行内存扩展的经典技术,该机制通过将较少使用的内存页面暂存在存储设备,以达到扩展内存的目的.过去页交换机制由于慢速磁盘的读写速度限制,无法被广泛应用.近年来,得益于超低延迟固态硬盘(solid state drive,SSD)的快速发展,页交换机制可以利用其低延迟的读写特性,提升页交换效率.然而,在低I/O延迟的情况下,传统页交换机制的I/O栈存在巨大的软件开销.首先对使用超低延迟SSD的Linux页交换机制进行测试与分析,发现现有页交换机制的主要瓶颈在于发送请求时存在队头阻塞问题、I/O合并和调度开销,以及内核返回路径上的中断处理和直接内存回收开销.基于分析结果,提出基于超低延迟SSD的页交换机制Ultraswap.Ultraswap在Linux I/O栈的基础上增加对轮询请求的处理,并降低I/O合并与调度开销,实现轻量级的I/O栈.基于Ultraswap的I/O栈,对内核页交换机制的换入与换出路径进一步优化.通过优化对缺页、直接内存回收的处理,降低页交换机制关键路径上的时间开销.实验结果表明Ultraswap在应用测试场景下相比Linux页交换机制能够提升19%的平均性能;在可使用内存比例为20%的情况下,Ultraswap可达到33%的性能提升.
文摘NoSQL系统因其高性能、高可扩展性的优势在大数据管理中得到广泛应用,而key-value(KV)模型则是NoSQL系统中使用最广泛的一种存储模型.KV型本地存储系统对于以机械磁盘为持久化存储的情形,存在许多性能优化技术,但这些优化技术面对当前的硬件发展新趋势,如多核处理器、大内存和低延迟闪存、非易失性内存NVM(Non-Volatile Memory)等,难以充分发挥新硬件的优势,如数据索引、并发控制、事务日志管理等技术在多核架构下存在多核扩展性问题,又如数据存储策略不适应闪存SSD(Solid State Drive)的新存储特性而产生了IO利用率低效的问题.针对多核处理器、大内存和闪存、NVM等硬件发展新趋势,文中面向当前的大数据应用背景,综述了KV型本地存储系统在索引技术、并发控制、事务日志管理和数据放置等核心模块上的最新优化技术和系统研究成果.从处理器、内存和持久化存储的角度概括了KV型本地存储系统当前存在的最优技术,总结了当前研究尚未解决的技术挑战,并对KV型本地存储系统在CPU缓存高效性、事务日志扩展性和高可用性等方面的研究进行了展望.
文摘作为SSD(solid state drives)的存储元件,NAND闪存在进行写之前,存储单元必须先进行擦除,因此被称作写一次存储器。SSD的使用寿命受到存储单元的擦除次数的限制,因此减少擦除次数对于SSD的可靠性十分重要。提出了一种通过编码压缩后的差值信息的方法来对SSD中写过一次的页面进行二次写,从而减少SSD的擦除次数,延长使用寿命。首先计算物理页面中更新前后的数据的差值,然后将差值数据进行压缩,再将压缩后的数据进行编码后保存在写过的物理页中的可写位中,以此实现写过物理页的二次写。实验结果表明,对于数据更新为主的应用,该方法能够充分利用写过的物理页中的可写位,大幅减少SSD的擦除次数。
基金supported by the Major Research Plan of the National Natural Science Foundation of China under Grant No.92270202the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No.XDB44030200.
文摘Non-volatile memories(NVMs)provide lower latency and higher bandwidth than block devices.Besides,NVMs are byte-addressable and provide persistence that can be used as memory-level storage devices(non-volatile main memory,NVMM).These features change storage hierarchy and allow CPU to access persistent data using load/store instructions.Thus,we can directly build a file system on NVMM.However,traditional file systems are designed based on slow block devices.They use a deep and complex software stack to optimize file system performance.This design results in software overhead being the dominant factor affecting NVMM file systems.Besides,scalability,crash consistency,data protection,and cross-media storage should be reconsidered in NVMM file systems.We survey existing work on optimizing NVMM file systems.First,we analyze the problems when directly using traditional file systems on NVMM,including heavy software overhead,limited scalability,inappropriate consistency guarantee techniques,etc.Second,we summarize the technique of 30 typical NVMM file systems and analyze their advantages and disadvantages.Finally,we provide a few suggestions for designing a high-performance NVMM file system based on real hardware Optane DC persistent memory module.Specifically,we suggest applying various techniques to reduce software overheads,improving the scalability of virtual file system(VFS),adopting highly-concurrent data structures(e.g.,lock and index),using memory protection keys(MPK)for data protection,and carefully designing data placement/migration for cross-media file system.
基金supported by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No.XDB44030200the Beijing Natural Science Foundation-Haidian Joint Fund for Original Innovation under Grant No.L192038.
文摘Persistent memory(PM)promises byte-addressability,large capacity,and durability.Main memory systems,such as key-value stores and in-memory databases,benefit from such features of PM.Due to the great popularity of hash-ing index in main memory systems,a number of research efforts are made to provide high average performance persistent hashing.However,suboptimal tail performance in terms of tail throughput and tail latency is still observed for existing persistent hashing.In this paper,we analyze major sources of suboptimal tail performance from key design issues of persis-tent hashing.We identify the global hash structure and concurrency control as remaining explorable design spaces for im-proving tail performance.We propose Directory-sharing Multi-level Extendible Hashing(Dalea)for PM.Dalea designs an-cestor link-based extendible hashing as well as fine-grained transient lock to address the two main sources(rehashing and locking)affecting tail performance.The evaluation results show that,compared with state-of-the-art persistent hashing Dash,Dalea achieves increased tail throughput by 4.1x and reduced tail latency by 5.4x.Moreover,in order to provide de-sign guidelines for improving tail performance,we adopt Dalea as a testbed to identify different impacts of four factors on tail performance,including fine-grained rehashing,transient locking,memory pre-allocation,and fingerprinting.
基金This work was supported by the National Basic Research 973 Program of China under Grant No. 2011CB302502, the National Natural Science Foundation of China under Grant No. 61379042, Huawei Research Program under Grant No. YB2013090048, and the Strategic Priority Research Program of Chinese Academy of Sciences under Grant No. XDA06010401.
文摘As the scaling of applications increases, the demand of main memory capacity increases in order to serve large working set. It is difficult for DRAM (dynamic random access memory) based memory system to satisfy the memory capacity requirement due to its limited scalability and high energy consumption. Compared to DRAM, PCM (phase change memory) has better scalability, lower energy leakage, and non-volatility. PCM memory systems have become a hot topic of academic and industrial research. However, PCM technology has the following three drawbacks: long write latency, limited write endurance, and high write energy, which raises challenges to its adoption in practice. This paper surveys architectural research work to optimize PCM memory systems. First, this paper introduces the background of PCM. Then, it surveys research efforts on PCM memory systems in performance optimization, lifetime improving, and energy saving in detail, respectively. This paper also compares and summarizes these techniques from multiple dimensions. Finally, it concludes these optimization techniques and discusses possible research directions of PCM memory systems in future.