The tail latency of end-user requests, which directly impacts user experience and revenue, is closely tied to the numerous key-value store accesses that each request triggers. The replica selection algorithm is crucial for cutting the tail latency of these key-value accesses. The C3 algorithm, proposed at NSDI 2015, piggybacks the queue size of waiting keys from replica servers to guide replica selection at clients. Although C3 improves tail latency considerably, it suffers from poor timeliness of the feedback information, which directly influences replica selection. In this paper, we analyze the queue size of waiting keys in C3 and report several findings on queue-size variation. These findings motivate us to propose the Prediction-Based Replica Selection (PRS) algorithm, which predicts the queue size at replica servers under poor timeliness conditions instead of using the exponentially weighted moving average of the piggybacked queue size as C3 does. Consequently, PRS obtains more accurate queue-size estimates at clients than C3 and thus outperforms C3 in cutting tail latency. Simulation results confirm the advantage of PRS over C3.
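The abstract above contrasts PRS with an exponentially weighted moving average (EWMA) of piggybacked queue sizes. As a minimal sketch of that smoothing step only, the Python below keeps one EWMA per replica and picks the replica with the smallest estimate. The class name, the `alpha` value, and the naive "smallest estimate wins" rule are assumptions made for illustration; this is neither C3's actual ranking function nor the PRS predictor, neither of which the abstract specifies.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class QueueSizeEwma:
    """Client-side smoothed estimate of one replica's queue size (hypothetical helper)."""
    alpha: float = 0.9                  # weight on the previous estimate (assumed value)
    estimate: Optional[float] = None    # None until the first feedback arrives

    def update(self, piggybacked_queue_size: float) -> float:
        """Fold a queue size piggybacked on a response into the running estimate."""
        if self.estimate is None:
            self.estimate = float(piggybacked_queue_size)
        else:
            self.estimate = (self.alpha * self.estimate
                             + (1.0 - self.alpha) * piggybacked_queue_size)
        return self.estimate


def pick_replica(estimates: Dict[str, QueueSizeEwma]) -> str:
    """Naive selection: the replica with the smallest estimated queue size.

    Replicas with no feedback yet are treated as having an empty queue.
    """
    def score(name: str) -> float:
        est = estimates[name].estimate
        return 0.0 if est is None else est
    return min(estimates, key=score)


if __name__ == "__main__":
    replicas = {"r1": QueueSizeEwma(alpha=0.5), "r2": QueueSizeEwma(alpha=0.5)}
    replicas["r1"].update(10)
    replicas["r1"].update(20)
    replicas["r2"].update(30)
    print(pick_replica(replicas))
```

Because the estimate only changes when a response carries new feedback, it can lag the server's real queue between responses, which is the timeliness problem the abstract describes.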
A key-value store offers flexibility in data types because it does not require data types to be specified in advance and can store any type of data as the value of a key-value pair. Many studies have sought to improve the performance of key-value stores while maintaining this flexibility. However, research on storing large-scale values such as multimedia files (e.g., images or videos) in a key-value store has been limited. In this study, we propose a new key-value store, WR-Store++, that aims to store large-scale values stably. Specifically, it provides a new design that separates data from the index by working with the built-in data structure of the Windows operating system and the file system. Using the built-in data structure of the Windows operating system makes the key-value store efficient, and using the file system significantly extends the limited storage space. We also present chunk-based memory management and parallel processing in WR-Store++ to further improve the performance of the GET operation. Through experiments, we show that WR-Store++ can store datasets at least 32.74 times larger than the existing baseline key-value store, WR-Store, which is limited in storing large-scale datasets. Furthermore, in terms of processing efficiency, we show that WR-Store++ outperforms not only WR-Store but also other state-of-the-art key-value stores, LevelDB, RocksDB, and BerkeleyDB, for individual key-value operations and mixed workloads.
Based on a log-structured merge (LSM) tree, a key-value (KV) storage system can provide high read performance and optimize random write performance. It is widely used in modern data storage systems for e-commerce, online analytics, and real-time communication. An LSM tree stores new KV data in memory and flushes it to disk in batches. To prevent the loss of in-memory data in an unexpected crash, RocksDB appends each update to the write-ahead log (WAL) before updating memory. However, a synchronous WAL significantly reduces write performance. In this paper, we present a new WAL mechanism named MyWAL. It directly manages raw devices (or partitions) instead of saving data on a traditional file system. This avoids unnecessary metadata updates and writes data sequentially to disk. Experimental results show that MyWAL significantly improves the write performance of RocksDB compared with the traditional WAL for small KV data on solid-state disks (SSDs), by as much as five to eight times. On non-volatile memory express solid-state drives (NVMe SSDs) and non-volatile memory (NVM), MyWAL improves write performance by 10%–30%. Furthermore, results from the YCSB (Yahoo! Cloud Serving Benchmark) show that latency decreased by 50% compared with SpanDB.
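The write-ahead ordering described above (append the update to the log, then update memory) can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions: the class name, the JSON-lines record format, and the replay logic are all hypothetical, and the log goes through an ordinary file system, so it corresponds to the traditional synchronous WAL path rather than MyWAL's raw-device design or RocksDB's actual log format.

```python
import json
import os


class TinyWalStore:
    """Toy key-value store with a synchronous write-ahead log (hypothetical)."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.memtable = {}
        self._replay()                       # rebuild memory from the log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self) -> None:
        """Re-apply logged updates in order; stop at a torn final record."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break                    # incomplete record from a crash
                self.memtable[record["k"]] = record["v"]

    def put(self, key: str, value: str) -> None:
        # 1. Append to the log and force it to stable storage.
        self._log.write(json.dumps({"k": key, "v": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. Only then update the in-memory table.
        self.memtable[key] = value

    def get(self, key: str):
        return self.memtable.get(key)

    def close(self) -> None:
        self._log.close()
```

The `os.fsync` call on every `put` is what makes the log synchronous, and it is also the per-write cost that the abstract identifies as the performance problem.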
Large-scale key-value stores are widely used in many Web-based systems to store huge amounts of data as (key, value) pairs. To reduce the latency of accessing such (key, value) pairs, an in-memory cache system is usually deployed between the front-end Web system and the back-end database system. In practice, a cache system may consist of a number of server nodes, and fault tolerance is a critical feature for maintaining latency Service-Level Agreements (SLAs). In this paper, we present the design, implementation, analysis, and evaluation of R-Memcached, a reliable in-memory key-value cache system built on top of the popular Memcached software. R-Memcached exploits coding techniques to achieve reliability and can tolerate up to two node failures. Our experimental results show that R-Memcached maintains very good latency and throughput even during periods of node failure.
Many key-value stores use RDMA to optimize messaging and data transmission between the application layer and the storage layer, but most of them provide only point-wise operations. A skiplist-based store can support both point operations and range queries, but its CPU-intensive access operations, combined with a high-speed network, easily drive the storage layer into a CPU bottleneck. The common solution to this problem is to offload some operations to the application layer and use RDMA to bypass the CPU and perform remote access directly, but this method has been used only in hash-table-based stores. In this paper, we present RS-store, a skiplist-based key-value store with RDMA, which overcomes the CPU bottleneck of the storage layer by enabling two access modes: local access and remote access. In RS-store, we design a novel data structure, R-skiplist, to reduce the communication cost of remote access, and implement a latch-free concurrency control mechanism to coordinate concurrent operations across the two access modes. RS-store also supports client-active range queries, which reduce the storage layer's CPU consumption. Finally, we evaluate RS-store on an RDMA-capable cluster. Experimental results show that RS-store achieves up to 2x improvement over RDMA-enabled RocksDB in throughput and application scalability.
Key-value (KV) stores have become a backbone of large-scale applications in today's data centers. Write-optimized data structures like the Log-Structured Merge-tree (LSM-tree) and its variants are widely used in KV storage systems like BigTable and RocksDB. A conventional LSM-tree organizes KV items into multiple, successively larger components and uses compaction to push KV items from one smaller component to the adjacent larger component until the KV items reach the largest component. Unfortunately, the current compaction scheme incurs significant write amplification due to repeated KV item reads and writes, which results in poor throughput. We propose a new compaction scheme, delayed compaction (dCompaction), that decreases write amplification. dCompaction postpones some compactions and gathers them into the following compaction. In this way, it avoids KV item reads and writes during compaction and consequently improves the throughput of LSM-tree-based KV stores. We implement dCompaction on RocksDB and conduct extensive experiments. Validation using the YCSB framework shows that, compared with RocksDB, dCompaction improves write performance by about 40% while keeping read performance comparable.
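To make the write-amplification argument above concrete, here is a toy two-component model in Python. It is a sketch under loud assumptions, not dCompaction's actual policy: the function name, the parameters, the distinct-keys assumption, and the rule "merge every N flushes" are all invented for illustration. It only shows why gathering several small merges into one reduces how often existing items are rewritten.

```python
def write_amplification(num_flushes: int, flush_size: int, merge_every: int) -> float:
    """Items written to disk divided by items written by the user.

    Toy model (assumed, not from the paper): each flush writes `flush_size`
    new items to a small component. Every `merge_every` flushes, the small
    component is merged into the large one, and the merge rewrites the whole
    merged result. Keys are assumed distinct, so merges never shrink data.
    """
    if num_flushes <= 0 or flush_size <= 0 or merge_every <= 0:
        raise ValueError("all arguments must be positive")

    large = 0         # items currently in the large component
    pending = 0       # items waiting in the small component
    disk_writes = 0   # total items written to disk so far

    for i in range(1, num_flushes + 1):
        pending += flush_size
        disk_writes += flush_size      # cost of the flush itself
        if i % merge_every == 0:
            large += pending
            disk_writes += large       # merge rewrites old and new items
            pending = 0

    user_writes = num_flushes * flush_size
    return disk_writes / user_writes


if __name__ == "__main__":
    for every in (1, 2, 4):
        print(every, write_amplification(4, 100, every))
```

With four flushes of 100 items each, merging after every flush gives a ratio of 3.5, merging every two flushes gives 2.5, and a single merge at the end gives 2.0 in this model.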
With the explosive increase in mobile apps, more and more threats are migrating from the traditional PC client to mobile devices. Whereas the Win+Intel alliance dominated the PC, the Android+ARM alliance dominates the mobile Internet, and apps have replaced PC client software as the major target of malicious usage. In this paper, to improve the security of current mobile apps, we propose a methodology for evaluating mobile apps based on a cloud computing platform and data mining. We also present a prototype system named MobSafe to identify whether a mobile app is malicious or benign. Compared with traditional methods, such as permission-pattern-based methods, MobSafe combines dynamic and static analysis to evaluate an Android app comprehensively. In the implementation, we adopt the Android Security Evaluation Framework (ASEF) and the Static Android Analysis Framework (SAAF), two representative dynamic and static analysis methods, to evaluate Android apps and estimate the total time needed to evaluate all the apps stored in one mobile app market. Based on a real trace from a commercial mobile app market called AppChina, we collect statistics on the number of active Android apps, the average number of apps installed on one Android device, and the expansion ratio of mobile apps. Because the mobile app market serves as the main line of defence against mobile malware, our evaluation results show that it is practical to use a cloud computing platform and data mining to verify all stored apps routinely and filter malware out of mobile app markets. As future work, MobSafe can make extensive use of machine learning to conduct automated forensic analysis of mobile apps based on the multifaceted data generated in this stage.
Funding (MyWAL): Project supported by the National Key Research and Development Project of China (No. 2022YFB2702101), the Shaanxi Province Key Industrial Projects, China (Nos. 2021ZDLGY03-02 and 2021ZDLGY03-08), and the National Natural Science Foundation of China (No. 92152301).
Funding (R-Memcached): Supported in part by Hong Kong GRF grant HKBU 210412 and HKBU grant FRG2/14-15/059.
Funding (RS-store): This work was supported by the Youth Program of the National Science Foundation of China (61702189).
Funding (dCompaction): This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000202 and the National Natural Science Foundation of China under Grant Nos. 61303056 and 61379042.
Funding (MobSafe): The National Key Basic Research and Development (973) Program of China (Nos. 2012CB315801 and 2011CB302805), the National Natural Science Foundation of China (Nos. 61161140320 and 61233016), and the Intel Research Council under the title "Security Vulnerability Analysis based on Cloud Platform with Intel IA Architecture".