Efficient cache management plays a vital role in in-memory data-parallel systems, such as Spark, Tez, Storm, and HANA. Recent research, notably on the Least Reference Count (LRC) and Most Reference Distance (MRD) policies, has shown that dependency-aware cache management that considers the application's directed acyclic graph (DAG) performs well in Spark. However, these policies ignore deeper relationships between RDDs and may cache redundant RDDs that share the same child RDDs, which degrades memory performance. Hence, in memory-constrained situations, systems may encounter a performance bottleneck due to frequent data block replacement. In addition, the prefetch mechanisms in some cache management policies, such as MRD, are hard to trigger. In this paper, we propose a new cache management method called RDE (Redundant Data Eviction) that fully utilizes an application's DAG information to optimize management decisions. By considering both RDD dependencies and the reference sequence, we effectively evict RDDs with redundant features and free memory for incoming data blocks. Experiments show that RDE improves performance by an average of 55% compared to LRU, and by up to 48% and 20% compared to LRC and MRD, respectively. RDE also shows less sensitivity to memory bottlenecks, which means better availability in memory-constrained environments.
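RDE's central observation, that two cached RDDs are redundant when they feed exactly the same child RDDs, can be illustrated with a minimal sketch. The function name and data layout below are hypothetical and simplify the paper's algorithm to its core dependency check:

```python
def find_redundant(cached, children):
    """Among cached RDDs, flag those whose child set duplicates another
    cached RDD's child set -- keeping one copy suffices for those children.

    cached:   set of cached RDD ids
    children: dict mapping RDD id -> list of child RDD ids (from the DAG)
    """
    seen = {}
    redundant = []
    for rdd in sorted(cached):
        key = frozenset(children.get(rdd, ()))
        if key and key in seen:
            redundant.append(rdd)  # another cached RDD already serves these children
        else:
            seen[key] = rdd
    return redundant
```

A real policy would combine this check with the reference sequence before evicting, as the abstract describes; the sketch only shows the redundancy test itself.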
A notable portion of cachelines in real-world workloads exhibits non-uniform access behavior within a line. However, modern cache management rarely considers this fine-grained feature, which limits the effective cache capacity of contemporary high-performance spacecraft processors. To harness these non-uniform access behaviors, an efficient cache replacement framework featuring an auxiliary cache specifically designed to retain evicted hot data was proposed. This framework reconstructs the cache replacement policy, facilitating data migration between the main cache and the auxiliary cache. Unlike traditional cacheline-granularity policies, the approach excels at identifying and evicting infrequently used data, thereby optimizing cache utilization. The evaluation shows impressive performance improvement, especially on workloads with irregular access patterns. Benefiting from fine granularity, the proposal achieves superior storage efficiency compared with commonly used cache management schemes, providing a potential optimization opportunity for modern resource-constrained processors, such as spacecraft processors. Furthermore, the framework complements existing modern cache replacement policies and can be seamlessly integrated with minimal modifications, enhancing their overall efficacy.
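The main-cache/auxiliary-cache migration described above can be sketched as an LRU main cache that, on eviction, retains "hot" victims (those accessed more than a threshold number of times) in a small auxiliary cache. The class, thresholds, and promotion rule are assumptions for illustration, not the paper's exact design:

```python
from collections import OrderedDict

class AuxCache:
    """LRU main cache plus an auxiliary cache that retains evicted hot data."""

    def __init__(self, main_size, aux_size, hot_threshold=2):
        self.main = OrderedDict()   # addr -> access count, LRU order
        self.aux = OrderedDict()
        self.main_size, self.aux_size = main_size, aux_size
        self.hot = hot_threshold

    def access(self, addr):
        """Return True on a hit (in either cache), False on a miss."""
        if addr in self.main:
            self.main[addr] += 1
            self.main.move_to_end(addr)
            return True
        if addr in self.aux:                      # hit in auxiliary cache:
            count = self.aux.pop(addr)            # migrate back to main
            self._insert(addr, count + 1)
            return True
        self._insert(addr, 1)
        return False

    def _insert(self, addr, count):
        if len(self.main) >= self.main_size:
            victim, vcount = self.main.popitem(last=False)
            if vcount >= self.hot:                # retain evicted hot data
                if len(self.aux) >= self.aux_size:
                    self.aux.popitem(last=False)
                self.aux[victim] = vcount
        self.main[addr] = count
```

Cold victims are simply dropped, so the auxiliary cache holds only lines worth a second chance, which is the storage-efficiency argument the abstract makes.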
Caching is an important technique to enhance the efficiency of query processing. Unfortunately, traditional caching mechanisms are not efficient for the deep Web because of storage space and dynamic maintenance limitations. In this paper, we present a cache mechanism based on Top-K data sources (KDS-CM), instead of caching result records, for deep Web queries. By integrating techniques from IR and Top-K query processing, a data reorganization strategy is presented to model KDS-CM. We also propose several cache management and optimization measures to improve cache performance effectively. Experimental results show the benefits of KDS-CM in execution cost and dynamic maintenance when compared with various alternative strategies.
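The key shift in KDS-CM, caching which data sources answer a query best rather than the result records themselves, can be sketched as a per-query ranking step. The function and scoring inputs are hypothetical; the paper's IR-based scoring is more elaborate:

```python
import heapq

def cache_top_k_sources(source_scores, k):
    """Keep only the k highest-scoring data sources for a query topic.

    source_scores: dict mapping source id -> relevance score (e.g. from IR ranking)
    Returns source ids in descending score order; these ids, not result
    records, are what the cache stores.
    """
    return heapq.nlargest(k, source_scores, key=source_scores.get)
```

Because only small source identifiers are cached, storage cost stays low and maintenance reduces to re-scoring sources, which matches the abstract's motivation.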
In recent years, Delay Tolerant Networks (DTN) have received more and more attention. At the same time, existing DTN routing algorithms generally suffer from poor scalability and an inability to perceive changes in the network environment. This paper proposes an AdaptiveSpray routing algorithm. The algorithm dynamically controls the initial maximum message copy count according to the cache occupancy rate of the node itself, and the cache occupancy rate is added as an impact factor to the calculation of the probability of each node meeting the destination node. In the forwarding phase, a node first compares its own meeting probability for the destination node with that of the encountered node, and then chooses between different forwarding strategies. Simulations show that the proposed AdaptiveSpray algorithm has clear advantages over existing routing algorithms in terms of message delivery rate and average delay.
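The two occupancy-aware quantities described above can be sketched as follows. The linear scaling of the copy budget and the PRoPHET-style damped probability update are assumptions for illustration; the paper's exact formulas may differ:

```python
def initial_copy_count(l_max, occupancy):
    """Shrink the initial message-copy budget as the node's own buffer fills.

    l_max:     maximum copy count when the buffer is empty
    occupancy: cache occupancy rate in [0, 1]
    """
    return max(1, round(l_max * (1.0 - occupancy)))

def meeting_probability(p_old, p_init, occupancy):
    """Update the probability of meeting the destination, damped by buffer
    occupancy so that heavily loaded nodes attract fewer forwards."""
    return p_old + (1.0 - p_old) * p_init * (1.0 - occupancy)
```

In the forwarding phase a node would forward only when the encountered node's `meeting_probability` for the destination exceeds its own, as the abstract describes.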
In this paper, we explore network architecture and key technologies for content-centric networking (CCN), an emerging networking technology in the big-data era. We describe the structure and operation mechanism of the CCN node. Then we discuss mobility management, routing strategy, and caching policy in CCN. For better network performance, we propose a probability cache replacement policy that is based on content popularity. We also propose and evaluate a probability cache with an evicted copy-up decision policy.
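A popularity-based probability replacement policy can be sketched as weighted victim selection, where less popular content is more likely to be evicted. This is a simplified reading of the policy; the paper's exact probability model is not reproduced here:

```python
import random

def pick_victim(popularity, rng=random):
    """Choose an eviction victim with probability inversely proportional
    to content popularity.

    popularity: dict mapping content id -> popularity score (> 0)
    rng:        a random.Random instance (injectable for reproducibility)
    """
    items = list(popularity)
    weights = [1.0 / popularity[c] for c in items]
    return rng.choices(items, weights=weights, k=1)[0]
```

Popular content thus tends to survive in the cache, raising the in-network hit rate that CCN caching aims for.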
Client cache is an important technology for the optimization of distributed and centralized storage systems. As a representative client cache system, the performance of CacheFiles is limited by transition faults. Furthermore, CacheFiles supports only a simple LRU policy with a tightly-coupled design. To overcome these limitations, we propose to employ the Stable Set Model (SSM) to improve CacheFiles and design an enhanced CacheFiles, SAC. SSM assumes that data access can be decomposed into accesses on some stable sets, in which elements are always repeatedly accessed, or not accessed, together. Using SSM methods can improve cache management and reduce the effect of transition faults. We also adopt loosely-coupled methods to design prefetch and replacement policies. We implement our scheme on Linux 2.6.32 and measure the execution time of the scheme with various file I/O benchmarks. Experiments show that SAC can significantly improve I/O performance and reduce execution time by up to 84%, compared with the existing CacheFiles.
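The SSM assumption, that blocks decompose into sets whose elements are always accessed together or not at all, can be sketched by grouping blocks with identical membership signatures across a request trace. The request-granularity trace format is an assumption for illustration:

```python
def stable_sets(trace):
    """Decompose blocks into stable sets: blocks that appear in exactly
    the same requests across the trace form one set.

    trace: list of sets, each the blocks touched by one request
    """
    blocks = set().union(*trace)
    groups = {}
    for b in blocks:
        signature = tuple(b in request for request in trace)
        groups.setdefault(signature, set()).add(b)
    return sorted(groups.values(), key=min)
```

Prefetch and replacement can then operate on whole stable sets instead of single blocks, which is how SSM improves on per-block LRU.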
In modern energy-saving replication storage systems, a primary group of disks is always powered up to serve incoming requests while other disks are often spun down to save energy during slack periods. However, since new writes cannot be immediately synchronized to all disks, system reliability is degraded. In this paper, we develop a high-reliability and energy-efficient replication storage system, named RERAID, based on RAID10. RERAID employs part of the free space in the primary disk group and uses erasure coding to construct a code cache at the front end to absorb new writes. Since the code cache supports recovery from two or more disk failures by using erasure coding, RERAID guarantees a reliability comparable with that of a RAID10 storage system. In addition, we develop an algorithm, called erasure coding write (ECW), to buffer many small random writes into a few large writes, which are then written to the code cache sequentially and in parallel to improve write performance. Experimental results show that RERAID significantly improves write performance and saves more energy than existing solutions.
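The buffering step of ECW, turning many small random writes into a few large sequential ones, can be sketched with a stripe-sized staging buffer. The class is a hypothetical illustration; real ECW also computes erasure-coding parity for the stripe, which is omitted here:

```python
class WriteBuffer:
    """Batch small random writes into stripe-sized sequential writes
    destined for the code cache."""

    def __init__(self, stripe_blocks):
        self.stripe_blocks = stripe_blocks
        self.pending = []          # small writes awaiting batching
        self.stripes_written = []  # each entry models one large sequential write

    def write(self, block):
        self.pending.append(block)
        if len(self.pending) == self.stripe_blocks:
            # flush one large sequential write instead of many small random ones
            self.stripes_written.append(tuple(self.pending))
            self.pending.clear()
```

Because each flushed stripe is a single contiguous write, the code cache sees sequential I/O even when the workload is random, which is where the write-performance gain comes from.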
With the continuous development of network technology, the number of streaming media videos is growing rapidly. More and more users are watching videos through the Internet, which leads to increasingly heavy server load and rising transmission cost across ISP domains. A feasible scheme to reduce transmission cost across ISP domains and alleviate the server load is to cache some popular videos at a large number of terminal users. Therefore, in this paper, in order to utilize the idle resources of the terminal peers, some peers with good performance were selected from the fixed peers as super peers, which were aggregated into a super peer set (SPS). In addition, based on the supply and demand relation of streaming videos among ISP domains, a mathematical model was formulated to optimize the service utility of the ISP. Then, a collaborative cache strategy was proposed based on the utility optimization. The simulation results show that the proposed strategy can effectively improve user playback fluency and hit rate while ensuring the optimal service utility.
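The super-peer selection step can be sketched as ranking fixed peers by a capability score and keeping the best ones as the SPS. The bandwidth-times-uptime score and the peer record layout are assumptions for illustration; the paper's selection criteria are not specified in the abstract:

```python
def select_super_peers(peers, top_n):
    """Pick the top_n best-performing fixed peers as the super peer set (SPS).

    peers: dict mapping peer id -> {"bandwidth": ..., "uptime": ...}
    Score is a simple bandwidth * uptime product (an assumed metric).
    """
    def score(p):
        return peers[p]["bandwidth"] * peers[p]["uptime"]
    return sorted(peers, key=score, reverse=True)[:top_n]
```

The selected super peers would then hold the cached popular videos, keeping traffic inside the ISP domain where possible.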
Funding: supported by the National Natural Science Foundation of China under Grant 6110002.
Funding: Supported by the National Natural Science Foundation of China (60673139, 60473073, 60573090).
Funding: National Key R&D Program of China (2020YFB1807805, 2020YFB1807800); CERNET Innovation Project (NGII20190806).
Funding: supported by the National Natural Science Foundation of China under Grants No. 60872018 and No. 60902015, and by Major National Science and Technology Project No. 2011ZX03005-004-03.
Funding: supported by the National Basic Research 973 Program of China under Grant No. 2011CB302304; the National High Technology Research and Development 863 Program of China under Grant Nos. 2011AA01A102, 2013AA013201, and 2013AA013205; the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401; and the Chinese Academy of Sciences Key Deployment Project under Grant No. KGZD-EW-103-5(7).
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61472152, 61432037, 61572209, and 61300047), the Fundamental Research Funds for the Central Universities, China (No. 2015QN069), the Director Fund of the Wuhan National Laboratory for Optoelectronics (WNLO), and the MOE Key Laboratory of Data Storage System, China.
Funding: This research was supported by the National Key Research and Development Program of China (No. 2020YFF0305301) and the National Natural Science Foundation of China (61762029, U1811264).