Abstract
In recent years, the rapid growth in data volume has led more and more large applications to be deployed in distributed environments. To optimize the performance of parallel on-line transaction processing (OLTP) systems, these applications rely on data partitioning to carefully divide both the original data set and newly appended data across different data nodes. This paper presents a novel data partitioning strategy, the table dependency partitioning strategy (TDPS), for allocating both the static data already in the system and newly generated incremental data. The strategy first partitions the initial data according to the dependencies between tables. When new data arrive, it automatically assigns each data fragment to the most closely related partition. A series of experiments over the TPC-C benchmark shows that TDPS effectively improves system performance compared with previous methods.
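The two phases described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define how table dependency is measured or what "closest" means, so the sketch assumes that dependencies are parent references (as with foreign keys), that root rows are spread round-robin, and that a new fragment goes to the partition holding most of the rows it references. The names `partition_initial` and `assign_fragment` and the sample row identifiers are hypothetical.

```python
from collections import Counter


def partition_initial(rows, num_partitions):
    """Place initial rows so that dependent rows follow their parent.

    rows: list of (row_id, parent_id) tuples with parents listed before
    their children; parent_id is None for rows of a root table.
    Assumption: root rows are spread round-robin across partitions.
    """
    placement = {}
    next_root = 0
    for row_id, parent_id in rows:
        if parent_id is None:
            placement[row_id] = next_root % num_partitions
            next_root += 1
        else:
            # Co-locate the dependent row with the row it references.
            placement[row_id] = placement[parent_id]
    return placement


def assign_fragment(referenced_ids, placement, default_partition=0):
    """Pick a partition for a newly arrived fragment.

    Assumption: the "closest" partition is the one that already holds
    the largest number of rows referenced by the fragment.
    """
    votes = Counter(placement[r] for r in referenced_ids if r in placement)
    if not votes:
        return default_partition  # nothing known about the fragment
    return votes.most_common(1)[0][0]


# Hypothetical rows loosely modeled on a warehouse/district/customer hierarchy.
rows = [("w1", None), ("w2", None), ("d1", "w1"), ("d2", "w2"), ("c1", "d1")]
placement = partition_initial(rows, num_partitions=2)
target = assign_fragment(["d1", "c1", "w2"], placement)
```

Here the rows that depend on `w1` end up in the same partition as `w1`, and the new fragment is routed to that partition because two of its three references live there.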
Source
《计算机科学与探索》
CSCD
2013, Issue 9, pp. 800-810 (11 pages)
Journal of Frontiers of Computer Science and Technology
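Funding
National Natural Science Foundation of China, No. 61003086
Open Fund of the State Key Laboratory of Software Development Environment, No. SKLSDE-2012KF-09
Graduate Research Fund of Renmin University of China, No. 42306176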