Funding: National High Technology Research and Development Program of China (No. 2007AA01Z457); Shanghai Science and Technology Development Foundation, China (No. 07QA14033)
Abstract: In examining file transfer performance in a peer-to-peer file sharing system, a fundamental problem is how to describe the service rate of a file transfer. This paper examines the problem by analyzing how server-like nodes distribute their upstream bandwidth among their concurrent transfers. A sufficient condition is presented for the service rate that a receiver obtains when downloading a file to be asymptotically uniform. For the aggregate service rate for transferring a file in a system, a sufficient condition is presented for it to asymptotically follow a Zipf distribution. Both asymptotic equalities hold in the mean square sense. These analyses and sufficient conditions provide a mathematical basis for modeling file transfer processes in peer-to-peer file sharing systems.
Abstract: A new file assignment strategy for parallel I/O on cluster computing systems, named the heuristic file sorted assignment algorithm, is proposed. With load balancing as its basis, it assigns files with similar service times to the same disk. First, the files are sorted in descending order of service time and stored in the set I. Then, when files are to be assigned, one disk of a cluster node is selected at random. Finally, consecutive files are taken in order from the set I and placed on that disk until the disk reaches its maximum load. Experimental results show that the new strategy improves performance by 20.2% when the system load is light and by 31.6% when the load is heavy. The higher the data access rate, the more evident the performance improvement obtained by the heuristic file sorted assignment algorithm.
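The three steps described in this abstract (sort, pick a random disk, fill it with consecutive files) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `sorted_assignment`, the `max_load` parameter, the choice to measure disk load as summed service time, and the rule that an empty disk always accepts at least one file are all assumptions made for illustration.

```python
import random


def sorted_assignment(service_times, num_disks, max_load, seed=0):
    """Assign files to disks in runs of similar service time.

    service_times: dict mapping file name -> expected service time.
    max_load: per-disk cap, measured here as summed service time
              (an assumption; the abstract does not define the load metric).
    """
    rng = random.Random(seed)
    # Set I: files sorted by service time, in descending order.
    ordered = sorted(service_times, key=service_times.get, reverse=True)
    free_disks = list(range(num_disks))
    assignment = {disk: [] for disk in free_disks}

    i = 0
    while i < len(ordered) and free_disks:
        # Select one of the still-empty disks at random.
        disk = free_disks.pop(rng.randrange(len(free_disks)))
        load = 0.0
        # Take consecutive files from I until the disk would exceed its cap.
        while i < len(ordered):
            t = service_times[ordered[i]]
            # Assumption: an empty disk always accepts one file,
            # so the loop makes progress even for oversized files.
            if assignment[disk] and load + t > max_load:
                break
            assignment[disk].append(ordered[i])
            load += t
            i += 1

    if i < len(ordered):
        raise ValueError("not enough disk capacity for all files")
    return assignment
```

Because consecutive entries of the sorted set land on the same disk, each disk ends up holding files with similar service times, which is the property the abstract attributes to the strategy.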
Funding: Supported by the National Natural Science Foundation of China (Nos. 61170209, 61272508, 61202432, 61370132, 61370092)
Abstract: In cloud computing, the number of replicas and the deployment strategy have an extensive impact on user requirements and storage efficiency. In this paper, a new definition of file access popularity based on users' preferences is given, together with a prediction algorithm that forecasts file access trends from historical data. Files are sorted by priority according to their popularity. A mathematical model relating file access popularity to the number of replicas is built so that reliability is increased efficiently. Most importantly, an optimal strategy for dynamic replica deployment is presented, based on file access popularity and taking into account the performance and load condition of the nodes. Under this strategy, files with high priority are deployed on nodes with better performance, so a higher quality of service is guaranteed. The strategy is implemented on the Hadoop platform, and its performance is compared with that of the default Hadoop strategy and the CDRM strategy. The results show that the proposed strategy not only maintains system load balance but also provides better service performance, which is consistent with the theoretical analysis.
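The placement idea in this abstract, where more popular files go to better-performing nodes, can be illustrated with a small Python sketch. The abstract does not give the paper's popularity model, its replica-count model, or its node scoring, so everything here is hypothetical: the function `place_replicas`, the precomputed `replica_counts` input, the single `node_scores` number per node, and the tie-breaking rule that prefers nodes holding fewer replicas.

```python
def place_replicas(popularity, node_scores, replica_counts):
    """Place replicas of popular files on the best available nodes.

    popularity: dict file -> popularity value (higher = higher priority).
    node_scores: dict node -> performance score (higher = better); hypothetical.
    replica_counts: dict file -> number of replicas, assumed precomputed.
    """
    held = {node: 0 for node in node_scores}  # replicas placed per node
    placement = {}
    # Process files in descending order of popularity (priority order).
    for name in sorted(popularity, key=popularity.get, reverse=True):
        k = min(replica_counts[name], len(node_scores))
        # Prefer lightly loaded nodes first, then higher performance scores.
        ranked = sorted(node_scores, key=lambda n: (held[n], -node_scores[n]))
        chosen = ranked[:k]
        for node in chosen:
            held[node] += 1
        placement[name] = chosen
    return placement
```

For example, with nodes scored `{"n1": 3, "n2": 2, "n3": 1}`, a file with popularity 0.9 and two replicas is placed on `n1` and `n2`, while a file with popularity 0.1 and one replica goes to the still-empty `n3`.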