Abstract
To address the excessive data redundancy factor and low data availability of current mainstream distributed processing platforms, this paper proposes a secondary-blocking data storage method based on improved Reed-Solomon (RS) coding. Each file block in the distributed environment is divided into multiple data sub-blocks, which are RS-encoded and stored across different machine nodes to reduce data redundancy. Experimental results show that the method effectively reduces data redundancy, improves data availability, and shortens task execution time.
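The general idea of splitting a block into sub-blocks and RS-encoding them can be illustrated with a minimal sketch. It is not the paper's improved RS scheme, whose details the abstract does not give. Everything here is an assumption made for illustration: the function names, the parameters k = 4 data sub-blocks and m = 2 parity sub-blocks, and the use of the prime field GF(257) in place of the GF(2^8) arithmetic that production erasure codes usually use. Sub-block i is treated as the value of a polynomial at point i, so any k surviving sub-blocks are enough to rebuild the block.

```python
P = 257  # prime field size (assumption; real systems usually use GF(2^8))

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def split_block(block, k):
    """Secondary blocking: cut one block into k equal sub-blocks (zero-padded)."""
    size = -(-len(block) // k)  # ceiling division
    padded = block.ljust(size * k, b"\0")
    return [list(padded[i * size:(i + 1) * size]) for i in range(k)]

def encode(block, k, m):
    """Return k data sub-blocks followed by m parity sub-blocks.
    Parity symbols are ints in 0..256, so they are kept as lists of ints."""
    subs = split_block(block, k)
    size = len(subs[0])
    parity = [
        [lagrange_eval([(i, subs[i][c]) for i in range(k)], k + p)
         for c in range(size)]
        for p in range(m)
    ]
    return subs + parity

def decode(available, k, length):
    """Rebuild the original block from any k sub-blocks.
    `available` maps sub-block index -> sub-block contents."""
    idx = sorted(available)[:k]
    size = len(available[idx[0]])
    out = bytearray()
    for i in range(k):  # data sub-blocks sit at evaluation points 0..k-1
        for c in range(size):
            out.append(lagrange_eval([(x, available[x][c]) for x in idx], i))
    return bytes(out[:length])

# Hypothetical usage: one sub-block per node, then lose two nodes.
k, m = 4, 2
data = b"hello distributed world"
shards = encode(data, k, m)
placement = {f"node{i}": shard for i, shard in enumerate(shards)}
survivors = {i: s for i, s in enumerate(shards) if i not in (0, 3)}
restored = decode(survivors, k, len(data))
print(restored == data, (k + m) / k)
```

With these illustrative parameters the storage overhead is (k + m) / k = 1.5 times the original data while tolerating the loss of any two sub-blocks, compared with 3 times for three-way replication, which tolerates the same number of losses.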
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
2013, No. 7, pp. 83-85, 93 (4 pages)
Keywords
Reed-Solomon (RS) coding
distributed processing
secondary blocking
data storage
data availability