Abstract
To address the resource management problems caused by large numbers of data replicas, this paper proposes a multi-replica clustering management method based on limited coding. Replicas are partitioned into hierarchies and clusters according to the process by which an existing single replica is copied to create new replicas. The partitioned replicas are then coded and organized under a coding rule defined as "replica hierarchy + replica sequence", and the same rule is used to manage the changes in clusters caused by dynamic replica adjustment (replica addition or removal). The method establishes a management model among the replicas that is centralized within a local area and peer-to-peer across the wide area; combined with the defined "minimal update propagation time", it can reduce the cost of maintaining consistency among a large number of replicas. The relationship between the coding rule and the number of replicas is discussed, along with solutions for replica failure and recovery. Performance evaluation results show that the method organizes large-scale data replicas effectively, scales well, is insensitive to moderate node failure, and suits applications with frequent updates.
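The abstract does not spell out the coding rule itself, so the following is only a minimal sketch of how a "replica hierarchy + replica sequence" code could behave. It assumes that the replicas created from the same source replica form one cluster, that a code is a tuple with one sequence number per hierarchy level, and that "limited" means both the depth and the cluster size are bounded. The names `Replica`, `MAX_LEVELS`, and `MAX_PER_CLUSTER` and the bound values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical bounds standing in for the "limited" part of the coding rule.
MAX_LEVELS = 3        # assumed maximum number of hierarchy levels
MAX_PER_CLUSTER = 4   # assumed maximum number of replicas in one cluster


@dataclass(eq=False)
class Replica:
    # One sequence number per level, e.g. (0, 2, 1): the length gives the
    # hierarchy level and the last element gives the order in the cluster.
    code: tuple
    children: list = field(default_factory=list)

    @property
    def level(self):
        return len(self.code) - 1   # the original replica is level 0

    @property
    def cluster(self):
        # Replicas copied from the same source share this code prefix.
        return self.code[:-1]

    def replicate(self):
        """Create a new replica from this one and give it the lowest free code."""
        if len(self.code) >= MAX_LEVELS:
            raise ValueError("hierarchy bound reached")
        used = {child.code[-1] for child in self.children}
        free = [i for i in range(MAX_PER_CLUSTER) if i not in used]
        if not free:
            raise ValueError("cluster is full")
        child = Replica(self.code + (free[0],))
        self.children.append(child)
        return child

    def remove(self, child):
        """Withdraw a replica; its sequence number becomes reusable."""
        self.children.remove(child)


root = Replica((0,))
a = root.replicate()      # first replica in the cluster headed by root
b = root.replicate()      # second replica in the same cluster
c = a.replicate()         # one level deeper, in the cluster headed by a
root.remove(a)            # dynamic adjustment: withdraw a replica
d = root.replicate()      # the freed sequence number is assigned again
```

In this sketch, adding or withdrawing a replica only touches the cluster of its source replica, which is one way to read the "centralized locally, peer-to-peer across the wide area" organization described in the abstract.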
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (北大核心)
2007, No. 6, pp. 1456-1467 (12 pages)
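Funding
National Natural Science Foundation of China, No. 69903011
National Basic Research Program of China (973 Program), No. 2002CB312105
Foundation for the Author of National Excellent Doctoral Dissertation of China, No. 200141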
Keywords
data replication
peer-to-peer distributed storage system
clustering
data consistency