Abstract: Video surveillance applications need a video data center that provides elastic virtual machine (VM) provisioning. However, the workloads of the VMs are hard to predict for an online video surveillance service, and workloads that arrive unpredictably easily lead to workload skew among VMs. In this paper, we study how to balance workload skew in an online video surveillance system. First, we design a system framework for the online surveillance service, which consists of video capturing and analysis tasks. Second, we propose StreamTune, an online resource scheduling approach for workload balancing that handles irregular video analysis workloads with the minimum number of VMs. We aim to balance the workload skew on video analyzers in a timely manner without depending on any workload prediction method. Finally, we evaluate the performance of the proposed approach using a traffic surveillance application. The experimental results show that our approach adapts well to workload variation and achieves workload balance with fewer VMs.
Funding: Supported by the Higher Education Commission (Pakistan); the National High Technology Research and Development Program of China (863 Program) (No. 2008AA01A201); the Natural Science Foundation of China (Nos. 60833004 and 60970002); and the TNList Cross-discipline Foundation.
Abstract: Off-chip replacement (capacity and conflict) misses and coherent read misses in a distributed shared memory system cause execution to stall for hundreds of cycles. These misses recur, forming sequences of two or more misses called streams. Prior streaming techniques ignored both the reordering of misses and not-recently-accessed streams while streaming data. In this paper, we present a stream prefetcher design that deals with both problems. Our design uses stream waiting rooms to store not-recently-accessed streams, which helps remove more off-chip misses. In trace-based simulation, our stream prefetcher design removes 8% to 66% (on average 40%) of replacement misses and 17% to 63% (on average 39%) of coherent read misses. In cycle-accurate full-system simulation, our design gives speedups of 1.00 to 1.17 on Princeton Application Repository for Shared-Memory Computers (PARSEC) workloads running on a distributed shared memory system, with the exception of the dedup and swaptions workloads.