Funding: Project supported by the National High-Tech R&D Program (863) of China (Nos. 2012AA01A301, 2012AA010901, 2012AA010303, and 2015AA01A301), the Program for New Century Excellent Talents in University, the National Natural Science Foundation of China (Nos. 61272142, 61402492, 61402486, 61379146, and 61272483), the Laboratory Pre-research Fund (No. 9140C810106150C81001), and the Open Project of the State Key Laboratory of High-End Server & Storage Technology (No. 2014HSSA01).
Abstract: Non-volatile random-access memory (NVRAM) technology is maturing rapidly, and its byte-persistence feature allows the design of new and efficient fault tolerance mechanisms. In this paper we propose the versionized process (VerP), a new process model based on NVRAM that is natively non-volatile and fault tolerant. We introduce an intermediate software layer that allows us to run a process directly on NVRAM and to put all the process states into NVRAM, and then propose a mechanism to versionize all the process data. Each piece of the process data is given a special version number, which increases with each modification of that piece of data. The version number helps us trace the modification of any data and recover it to a consistent state after a system crash. Compared with traditional checkpoint methods, our work can achieve fine-grained fault tolerance at very little cost.
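
The versioning idea in the abstract can be illustrated with a small toy model. The sketch below is not the paper's implementation: the class `VersionedStore`, its methods, the single global counter, and the explicit `commit` point are all hypothetical names and design choices introduced here, and an ordinary Python dictionary stands in for NVRAM. It only shows how version numbers that grow with each modification make it possible to discard updates newer than the last consistent state.

```python
class VersionedStore:
    """Toy in-memory stand-in for versioned process data (not real NVRAM)."""

    def __init__(self):
        self.clock = 0       # assumed: one global, monotonically increasing counter
        self.committed = 0   # assumed: highest version known to be consistent
        self.history = {}    # key -> list of (version, value), oldest first

    def write(self, key, value):
        # Every modification is tagged with a strictly larger version number.
        self.clock += 1
        self.history.setdefault(key, []).append((self.clock, value))
        return self.clock

    def commit(self):
        # Declare everything written so far to be a consistent state.
        self.committed = self.clock

    def read(self, key):
        # Return the newest value recorded for this key.
        return self.history[key][-1][1]

    def version(self, key):
        # Return the version number of the newest value for this key.
        return self.history[key][-1][0]

    def recover(self):
        # After a simulated crash, drop every version newer than the last commit.
        for key in list(self.history):
            kept = [(v, x) for v, x in self.history[key] if v <= self.committed]
            if kept:
                self.history[key] = kept
            else:
                del self.history[key]
        self.clock = self.committed


store = VersionedStore()
store.write("balance", 100)
store.write("owner", "alice")
store.commit()                 # consistent state: versions 1 and 2
store.write("balance", 40)     # uncommitted update
store.write("pending", True)   # uncommitted new item
store.recover()                # simulated crash and recovery
```

After `recover()`, `balance` is back at its committed value and `pending` no longer exists. A real design would keep the version metadata in persistent memory and would have to order its writes carefully so that the metadata itself survives a crash; this sketch ignores those concerns.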