Funding: This work was supported in part by the National High-tech Research and Development Program (863 Program) of China (2015AA01A202) and in part by the National Natural Science Foundation of China (Grant Nos. 61370057 and 61421003).
Abstract: Data items are usually replicated in modern distributed data stores to obtain high performance and availability. However, data replication involves availability-consistency and latency-consistency trade-offs, so system designers tend to choose weak consistency models, such as eventual consistency, which may result in stale reads. Since stale data items may lead to serious application semantic problems, we consider how to increase the probability of data recency, which provides a uniform view of recent versions of data items for all clients. In this work, we propose HARP, a framework that enhances the data recency of eventually consistent distributed data stores in an efficient and highly available way. By detecting possible stale reads, whether or not failures occur, HARP performs reread operations to eliminate stale results only when needed, based on our analysis of the write and read processes. We also present solutions for dealing with practical anomalies in HARP, including delayed, reordered, and dropped messages and clock drift, and show how to extend HARP to multiple datacenters. Finally, we implement HARP on top of Cassandra, and the experiments show that HARP can effectively eliminate stale reads with low overhead (less than 6.9%) compared with the original eventually consistent Cassandra.
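The idea of rereading only when a stale result is possible can be illustrated with a toy model. The sketch below is not HARP's detection algorithm, which the abstract does not specify; it assumes a hypothetical in-memory coordinator (`ToyCoordinator`) that knows which keys still have undelivered replica updates and rereads from all replicas only for those keys. All class and method names are invented for illustration and are unrelated to Cassandra's API.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # key -> (version, value); version 0 means "never written"
    data: dict = field(default_factory=dict)

    def apply(self, key, version, value):
        # Last-writer-wins: keep only the newest version.
        if key not in self.data or self.data[key][0] < version:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))


class ToyCoordinator:
    """Hypothetical coordinator for an eventually consistent toy store."""

    def __init__(self, n_replicas=3):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.pending = {}  # key -> updates not yet delivered to replicas
        self.version = 0
        self.rereads = 0   # counts how often the extra read was needed

    def write(self, key, value):
        # Apply to replica 0 immediately; delay the other replicas.
        self.version += 1
        self.replicas[0].apply(key, self.version, value)
        for i in range(1, len(self.replicas)):
            self.pending.setdefault(key, []).append((i, self.version, value))

    def propagate(self, key):
        # Deliver all delayed updates for this key.
        for i, version, value in self.pending.pop(key, []):
            self.replicas[i].apply(key, version, value)

    def read(self, key, replica_index):
        # Fast path: read from a single replica.
        version, value = self.replicas[replica_index].get(key)
        # Assumed staleness signal: the key still has undelivered updates.
        if key in self.pending:
            self.rereads += 1
            version, value = max(
                (r.get(key) for r in self.replicas), key=lambda t: t[0]
            )
        return value
```

In this toy, a read issued while an update is still in flight pays for one reread and returns the newest value, while a read issued after propagation takes the single-replica fast path. A real system would need a staleness signal that also copes with the delayed, reordered, and dropped messages and the clock drift mentioned in the abstract.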