In distributed storage systems, file access efficiency has an important impact on the real-time nature of information forensics. As a popular approach to improving file access efficiency, a prefetching model fetches data before it is needed, according to the file access pattern, which reduces I/O waiting time and increases system concurrency. However, a prefetching model must mine the degree of association between files to ensure prefetching accuracy. With massive numbers of small files, the sheer volume of files poses a challenge to the efficiency and accuracy of relevance mining. In this paper, we propose a prefetching model for massive files based on an LSTM neural network with a cache transaction strategy to improve file access efficiency. First, we propose a file clustering algorithm based on temporal and spatial locality to reduce computational complexity. Second, we define cache transactions according to the occurrence of files in the cache, instead of using time-offset-distance-based methods, to extract file block features accurately. Last, we propose a file access prediction algorithm based on an LSTM neural network that predicts the files most likely to be accessed. Experiments show that, compared with traditional LRU and plain grouping methods, the proposed model notably increases the cache hit rate and effectively reduces I/O wait time.
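As a rough illustration of the prediction step, the sketch below shows the shape of an LSTM next-file predictor over integer-encoded access sequences. The dimensions, window length, and class names are assumptions for illustration, not the paper's settings.

```python
# Minimal sketch of LSTM-based next-file prediction (PyTorch).
# File accesses are assumed to be encoded as integer file IDs.
import torch
import torch.nn as nn

class NextFilePredictor(nn.Module):
    def __init__(self, num_files, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_files, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_files)

    def forward(self, seq):                  # seq: (batch, window) of file IDs
        h, _ = self.lstm(self.embed(seq))    # h: (batch, window, hidden_dim)
        return self.out(h[:, -1, :])         # logits over the next file ID

# Toy usage: score the current access window and prefetch the top-k files.
model = NextFilePredictor(num_files=1000)
window = torch.randint(0, 1000, (1, 8))      # last 8 accesses (hypothetical)
prefetch_ids = torch.topk(model(window), k=3).indices
```

In a deployment, such a model would be trained on sliding windows cut from the access trace, with the file accessed immediately after each window as the label.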
Data pre-deployment in HDFS (the Hadoop distributed file system) is more complicated than in traditional file systems. Many key issues need to be addressed, such as determining the target location of data prefetching, the amount of data to be prefetched, and the balance between data prefetching services and normal data accesses. To solve these problems, we exploit the characteristics of digital ocean information service flows and propose a deployment scheme that combines input data prefetching with output-data-oriented storage strategies. The method achieves parallelism between data preparation and data processing, greatly reducing the I/O time cost of digital ocean cloud computing platforms when processing multi-source information synergistic tasks. The experimental results show that the scheme has a higher degree of parallelism than traditional Hadoop mechanisms, shortens the waiting time of a running service node, and significantly reduces data access conflicts.
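The core parallelism claim, preparing the next task's input while the current task computes, can be pictured with a minimal sketch; `fetch` and `process` are hypothetical stand-ins for the platform's staging and processing steps, not a Hadoop API.

```python
# Overlap input prefetching with processing: while task i is processed on
# the main thread, a background worker fetches the input of task i+1.
from concurrent.futures import ThreadPoolExecutor

def run_pipeline(task_inputs, fetch, process):
    if not task_inputs:
        return
    with ThreadPoolExecutor(max_workers=1) as io:
        nxt = io.submit(fetch, task_inputs[0])
        for i in range(len(task_inputs)):
            data = nxt.result()              # wait only if fetch lags compute
            if i + 1 < len(task_inputs):
                nxt = io.submit(fetch, task_inputs[i + 1])  # prefetch ahead
            process(data)                    # overlaps the background fetch
```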
With the number of connected devices increasing rapidly, access latency increases drastically in the edge cloud environment. Massive low-time-constrained and data-intensive mobile applications require efficient replication strategies to decrease retrieval time. However, many previous works do not determine replicas reasonably, which incurs high response delay. To this end, a correlation-aware replica prefetching (CRP) strategy based on the file correlation principle is proposed, which can prefetch files with high access probability. The key is to determine and obtain implicit high-value files effectively, which has a significant impact on the performance of CRP. To accelerate the acquisition of implicit high-value files, an access rule management method based on consistent hashing is proposed, and storage and query mechanisms for access rules based on an adjacency-list storage structure are further presented. Theoretical analysis and simulation results corroborate that, compared with other schemes, CRP shortens average response time by over 4.8%, improves average hit ratio by over 4.2%, reduces the amount of transmitted data by over 8.3%, and maintains replication frequency at a reasonable level.
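A hedged sketch of the two mechanisms the abstract names follows: a consistent-hash ring that decides which node stores a file's access rules, and an adjacency-list store of observed file-to-file correlations. All identifiers are illustrative; the abstract does not give the exact rule format.

```python
# Consistent hashing for access-rule placement, plus an adjacency-list store
# of "A is often followed by B" counts used to pick prefetch candidates.
import bisect
import hashlib
from collections import defaultdict

def _h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class RuleRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self.keys = [k for k, _ in self.ring]

    def node_for(self, filename):            # node holding this file's rules
        i = bisect.bisect(self.keys, _h(filename)) % len(self.ring)
        return self.ring[i][1]

rules = defaultdict(lambda: defaultdict(int))  # rules[a][b]: count of "a then b"

def record(prev_file, next_file):
    rules[prev_file][next_file] += 1

def prefetch_candidates(f, k=2):             # top-k files correlated with f
    return sorted(rules[f], key=rules[f].get, reverse=True)[:k]
```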
A new method was proposed that prefetches data blocks from the NVCache to the page cache in main memory and, in cascade, prefetches n blocks from the hard disk to the NVCache, in order to reduce the spin-up frequency of a hybrid hard disk drive and thus enhance I/O performance. The proposed method consists of three steps: 1) analyzing the pattern of read requests in block units; 2) determining the number of blocks prefetched to the NVCache; 3) replacing blocks in the NVCache according to the block replacement policy. The proposed method can reduce the latency of a hybrid hard disk and optimize the power consumption of an IPTV set-top box. Experimental results show that the proposed method improves average response time by 25.17% compared with the existing adaptive multi-stream prefetching (AMP) method, and reduces average power consumption by 20.83% compared with the existing external caching in energy-saving storage system (EXCES) method.
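Steps 1) and 2) might look roughly like the following: detect sequential runs in the block-level read stream and scale the prefetch depth with the run length. The doubling heuristic is an assumption for illustration; the abstract does not state the actual sizing policy.

```python
# Hypothetical prefetch sizer: longer sequential runs earn deeper prefetch.
class PrefetchSizer:
    def __init__(self, max_depth=32):
        self.last_block = None
        self.run = 0                         # length of current sequential run
        self.max_depth = max_depth

    def on_read(self, block):
        if self.last_block is not None and block == self.last_block + 1:
            self.run += 1
        else:
            self.run = 0                     # a random access resets the run
        self.last_block = block
        # No prefetch for random reads; depth doubles with run length.
        return min(self.max_depth, 2 ** self.run) if self.run else 0
```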
A novel approach that integrates occlusion culling within the view-dependent rendering framework is proposed. The algorithm uses the prioritized-layered projection (PLP) algorithm to cull occluded objects, and uses an approximate visibility technique to accurately and efficiently determine which objects will become visible in the near future and prefetch those objects from disk before they are rendered. The view-dependent rendering technique provides the ability to change the level of detail over the surface seamlessly and smoothly in real time, according to cell solidity values.
In this paper, a prefetching technique is proposed to solve the performance problem caused by remote data access delay. In this technique, the map tasks that will cause the delay are predicted first, and the input data of these tasks are preloaded before the tasks are scheduled. During execution, the input data can then be read from local nodes, so the delay is hidden. The technique has been implemented in Hadoop 0.20.1. The experimental results show that the technique reduces the number of map tasks causing delay and improves the performance of Hadoop MapReduce by 20%.
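A simplified sketch of the prediction step, under one assumption: a map task will cause delay when the node expected to run it holds no replica of its input split. The data structures below are hypothetical, not Hadoop's scheduler types.

```python
# Flag map tasks whose input would be a remote read, so their splits can be
# preloaded onto the target node before the tasks are scheduled.
from collections import namedtuple

MapTask = namedtuple("MapTask", "task_id split_id")

def plan_prefetch(pending_tasks, replica_locations, next_free_nodes):
    # replica_locations: {split_id: set of nodes holding a replica}
    plans = []
    for task, node in zip(pending_tasks, next_free_nodes):
        if node not in replica_locations[task.split_id]:
            plans.append((task.split_id, node))   # preload split onto node
    return plans
```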
In this paper, we present a comparative study of informed and predictive prefetching mechanisms that were introduced to bridge the performance gap between I/O storage systems and the CPU. In particular, we focus on transparent informed prefetching (TIP) and predictive prefetching using the probability graph approach (PG). Our main objective is to present the main features, motivations, and an implementation overview of each mechanism. We also provide a performance evaluation that compares both mechanisms under different cache sizes.
In this paper, we present a predictive prefetching mechanism based on the probability graph approach that performs prefetching between different levels of a parallel hybrid storage system. The fundamental concept of our approach is to exploit the parallel hybrid storage system's parallelism and prefetch data among multiple storage levels (e.g., solid-state disks and hard disk drives) in parallel with the application's on-demand I/O read requests. In this study, we show that predictive prefetching across multiple storage levels is an efficient technique for placing data blocks that will soon be needed in the uppermost levels, near the application. Our PPHSS approach extends previous ideas on predictive prefetching in two ways: (1) it reduces applications' elapsed execution time by keeping data blocks that are predicted to be accessed in the near future cached in the uppermost level; (2) it introduces a parallel data fetching scheme in which multiple fetching mechanisms (i.e., predictive prefetching and the application's on-demand data requests) work in parallel, where the first fetches data blocks among the different levels of the hybrid storage system (i.e., from low-level (slow) to high-level (fast) storage devices) and the second fetches data from the storage system to the application. Our PPHSS strategy, integrated with the predictive prefetching mechanism, significantly reduces overall I/O access time in a hybrid storage system. Finally, we developed a simulator to evaluate the performance of the proposed predictive prefetching scheme in the context of hybrid storage systems. Our results show that PPHSS can improve system performance by 4% on real-world I/O traces without requiring large caches.
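For reference, a probability graph in its classic form fits in a few lines: edges count how often one file is accessed within a short lookahead window after another, and a successor whose conditional probability clears a threshold becomes a candidate to stage into a faster level. The parameters below are illustrative defaults, not the paper's.

```python
# Probability-graph predictor: P(B follows A) estimated from window counts.
from collections import defaultdict, deque

class ProbabilityGraph:
    def __init__(self, lookahead=3, threshold=0.3):
        self.edges = defaultdict(lambda: defaultdict(int))  # edges[a][b]: count
        self.totals = defaultdict(int)       # successor observations per file
        self.window = deque(maxlen=lookahead)
        self.threshold = threshold

    def access(self, f):
        for prev in self.window:             # f followed every file in window
            self.edges[prev][f] += 1
            self.totals[prev] += 1
        self.window.append(f)

    def to_promote(self, f):                 # successors worth staging up
        t = self.totals[f]
        return [b for b, c in self.edges[f].items() if t and c / t >= self.threshold]
```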
A scheme for high-speed data transfer via the Internet for Web services in an extremely-large-delay environment is proposed. With the widespread use of Internet services in recent years, WLAN Internet service on high-speed trains has commenced. The system for this is composed of a satellite communication link between the train and the ground station, which is characterized by extremely large latency of several hundred milliseconds due to long propagation delay. High-speed Web access is not available to users on a train in such a network. Thus, a prefetch scheme for accelerating Web services in this environment is proposed. A test-bed system implementing the proposed scheme was built, and its performance was evaluated. The proposed scheme is verified to enable high-speed Web access in the extremely-large-delay environment compared with conventional schemes.
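The gist of such a prefetch scheme is that a page's embedded objects are requested concurrently, so they share one long round trip instead of paying it serially. A toy standard-library sketch follows; it is not the paper's test-bed implementation.

```python
# Fetch a page, then prefetch its embedded resources in parallel so the
# several-hundred-millisecond satellite RTT is paid roughly once, not per object.
import re
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin

def fetch_with_prefetch(url):
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    links = [urljoin(url, m) for m in re.findall(r'src="([^"]+)"', html)]
    with ThreadPoolExecutor(max_workers=8) as ex:
        assets = list(ex.map(lambda u: urllib.request.urlopen(u).read(), links))
    return html, assets
```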
With the widening speed gap between storage system access and processor computing, end-to-end data processing has become a bottleneck for improving the overall performance of computer systems on the Internet. Based on an analysis of data processing behavior, an adaptive cache organization scheme with fast address calculation is proposed. This scheme makes full use of the characteristics of stack-space data access, adopts a fast address calculation strategy, and reduces the hit time of stack accesses. The stack cache can also be turned off adaptively when a stack overflow occurs, avoiding the effect of stack switching on processor performance. In addition, a prefetching policy is developed based on the instruction cache and the miss behavior of the data cache, combined with captured state from the miss queue. Finally, the proposed method maintains the order of instruction and data accesses, which facilitates prefetching in end-to-end data processing.
The Hadoop distributed file system (HDFS) is a popular cloud storage platform, benefiting from its scalable, reliable, and low-cost storage capability. However, it is mainly designed for batch processing of large files, which means small files cannot be handled efficiently by HDFS. In this paper, we propose a mechanism to store small files in HDFS. In our approach, the file size is judged before uploading to HDFS. If the file size is less than the block size, all correlated small files are merged into one single file, and an index is built for each small file. Furthermore, prefetching and caching mechanisms are used to improve the reading efficiency of small files. Meanwhile, new small files can be appended to the merged file. Compared with the original HDFS, experimental results show that the storage efficiency of small files is improved.
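A minimal sketch of the merge-and-index idea, assuming a local container file and an in-memory index of (offset, length) per small file; a real implementation would go through the HDFS Java API and persist the index.

```python
# Merge small files into one container file; an index keeps each retrievable.
class MergedFile:
    def __init__(self, path):
        self.path = path
        self.index = {}                      # name -> (offset, length)

    def append(self, name, data: bytes):     # new small files are appended
        with open(self.path, "ab") as f:
            self.index[name] = (f.tell(), len(data))
            f.write(data)

    def read(self, name):                    # one seek instead of one block
        off, length = self.index[name]
        with open(self.path, "rb") as f:
            f.seek(off)
            return f.read(length)
```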
To date, file access prediction models have mainly been based on either file access frequency or the historical record of the latest accesses. In this paper, a new file access prediction model called frequency- and recency-based successor (FRS) is presented, which combines the advantages of file frequency with the historical record. The FRS model can respond rapidly to workload changes and can predict future events with greater accuracy than most other prediction models. To evaluate the performance of the FRS model, the Linux kernel was modified to predict and prefetch upcoming accesses. The experiment shows that FRS can accurately predict approximately 80% of all file access events, while maintaining only a per-file immediate successor queue (ISQ) that requires regular dynamic updates.
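One plausible way to combine the two signals (the abstract does not give the exact formula) is an exponentially decayed frequency score consulted over the per-file immediate successor queue; the decay constant and queue length below are assumptions.

```python
# FRS-style predictor: recency-weighted frequency scores rank the recorded
# immediate successors of the last accessed file.
from collections import defaultdict, deque

class FRSModel:
    def __init__(self, decay=0.9, isq_len=4):
        self.decay = decay
        self.score = defaultdict(float)      # decayed access frequency
        self.isq = defaultdict(lambda: deque(maxlen=isq_len))
        self.last = None

    def access(self, f):
        for k in self.score:                 # aging makes old accesses fade
            self.score[k] *= self.decay
        self.score[f] += 1.0
        if self.last is not None:
            self.isq[self.last].append(f)    # record immediate successor
        self.last = f

    def predict(self):                       # best-scored recent successor
        if self.last is None or not self.isq[self.last]:
            return None
        return max(self.isq[self.last], key=lambda c: self.score[c])
```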
The World Wide Web has become the primary means of information dissemination. Due to limited network bandwidth, users often suffer long waiting times. Web prefetching and web caching are the primary approaches to reducing user-perceived access latency and improving the quality of service. In this paper, a Stochastic Petri Net (SPN) based integrated web prefetching and caching model (IWPCM) is presented, and its performance is evaluated. The performance metrics access latency, throughput, HR (hit ratio), and BHR (byte hit ratio) are analyzed and discussed. Simulations show that, compared with a caching-only model (CM), IWPCM can further improve throughput, HR, and BHR and efficiently reduce access latency. The performance evaluation based on the SPN model can provide a basis for implementing web prefetching and caching, and the combination of the two holds the promise of improving the QoS of web systems.
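The two cache metrics the evaluation reports are simple ratios over the request trace; given requests recorded as (was_hit, size_bytes) pairs, a small helper might compute them as follows (illustrative, not the paper's simulator).

```python
# HR  = fraction of requests served from cache;
# BHR = fraction of requested bytes served from cache.
def hr_bhr(trace):
    hits = sum(1 for hit, _ in trace if hit)
    hit_bytes = sum(size for hit, size in trace if hit)
    total_bytes = sum(size for _, size in trace)
    return hits / len(trace), hit_bytes / total_bytes

print(hr_bhr([(True, 4096), (False, 1024), (True, 512)]))  # (0.667, 0.818)
```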
Docker, as a mainstream container solution, adopts the Copy-on-Write (CoW) mechanism in its storage drivers. This mechanism satisfies the need of different containers to share the same image. However, when a single container performs an operation such as modifying an image file, a duplicate is created in the upper read-write layer, which contributes to runtime overhead. When the accessed image file is fairly large, this additional overhead becomes non-negligible. Here we present Dynamic Prefetching Strategy Optimization (DPSO), which optimizes the CoW mechanism for a Docker container on the basis of a dynamic prefetching strategy. At the beginning of the container life cycle, DPSO pre-copies up the image files that are most likely to be copied up later, eliminating the overhead of performing this operation during application runtime. The experimental results show that DPSO has an average prefetch accuracy greater than 78% in complex scenarios and can effectively eliminate the overhead caused by the CoW mechanism.
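A sketch of the pre-copy-up step under one assumption: copy-up counts from earlier runs of the same image are available as a history map. The paths and budget parameter are illustrative, not DPSO's actual interface.

```python
# Pre-copy the image files most likely to be modified into the read-write
# layer at container start, paying the CoW cost before the application runs.
import os
import shutil

def pre_copy_up(history, lower_dir, upper_dir, budget=10):
    # history: {relative_path: copy-up count observed in earlier runs}
    ranked = sorted(history.items(), key=lambda kv: -kv[1])[:budget]
    for rel, _ in ranked:
        src, dst = os.path.join(lower_dir, rel), os.path.join(upper_dir, rel)
        if os.path.exists(src) and not os.path.exists(dst):
            subdir = os.path.dirname(dst)
            if subdir:
                os.makedirs(subdir, exist_ok=True)
            shutil.copy2(src, dst)           # duplicate now, not mid-request
```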
As the speed gap between main memory and modern processors continues to widen, cache behavior becomes more important for main memory database systems (MMDBs). Indexing is a key component of MMDBs. Unfortunately, the predominant indexes, B+-trees and T-trees, have been shown to utilize the cache poorly, which has triggered the development of many cache-conscious indexes, such as CSB+-trees and pB+-trees. Most of these cache-conscious indexes are variants of conventional B+-trees and have better cache performance than B+-trees. In this paper, we develop a novel J+-tree index, inspired by the Judy structure (an associative-array data structure), and propose a more cache-optimized index, the prefetching J+-tree (pJ+-tree), which applies prefetching to the J+-tree to accelerate range scan operations. The J+-tree stores all keys in its leaf nodes and keeps the references to leaf nodes in a Judy structure, which lets the J+-tree not only retain the advantages of Judy (such as fast single-value search) but also outperform it in other respects; for example, J+-trees achieve better performance on range queries than Judy. The pJ+-tree exploits prefetching techniques to further improve the cache behavior of J+-trees and yields a speedup of 2.0 on range scans. Our extensive experimental study shows that, compared with B+-trees, CSB+-trees, pB+-trees, and T-trees, pJ+-trees can provide better performance in both time (search, scan, update) and space.