Journal Articles: 2 results found
1. Using multi-threads to hide deduplication I/O latency with low synchronization overhead (Cited by: 1)
Authors: 朱锐, 秦磊华, 周敬利, 郑寰. Journal of Central South University (SCIE, EI, CAS), 2013, Issue 6, pp. 1582-1591 (10 pages).
Data deduplication, as a compression method, has been widely used in most backup systems to improve bandwidth and space efficiency. As the volume of data to be backed up explodes, the two main challenges in data deduplication are the CPU-intensive chunking and hashing work and the I/O-intensive disk-index access latency. Since the CPU-intensive work has already been widely parallelized and sped up by multi-core and many-core processors, I/O latency is becoming the bottleneck in data deduplication. To alleviate the I/O latency challenge on multi-core systems, a multi-threaded deduplication (Multi-Dedup) architecture was proposed. The main idea of Multi-Dedup is to use parallel deduplication threads to hide the I/O latency. A prefix-based concurrent index was designed to maintain the internal consistency of the deduplication index with low synchronization overhead, and a collision-less cache array was designed to preserve locality and similarity within the parallel threads. In experiments on various real-world datasets, Multi-Dedup achieves 3-5x performance improvements when combined with the locality-based ChunkStash and local-similarity-based SiLo methods. In addition, Multi-Dedup dramatically decreases synchronization overhead and achieves 1.5-2x performance improvements compared with traditional lock-based synchronization methods.
Keywords: multi-thread, multi-core, parallel, data deduplication
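To make the prefix-based concurrent index concrete, the sketch below shards a fingerprint index by the first byte of each chunk fingerprint, so parallel deduplication threads mostly touch different shards and rarely contend on a lock. This is only an illustrative C++ sketch of the general idea; the class name PrefixShardedIndex, the 256-way sharding, and the per-shard std::unordered_map are assumptions, not the data structure actually used by Multi-Dedup.

    // Illustrative sketch of a prefix-sharded fingerprint index (not the
    // paper's implementation): each shard owns its own mutex, so threads
    // working on fingerprints with different prefixes never block each other.
    #include <array>
    #include <cstdint>
    #include <mutex>
    #include <string>
    #include <unordered_map>

    class PrefixShardedIndex {
    public:
        // Returns true if the fingerprint was already present (a duplicate chunk);
        // otherwise records it with its container id and returns false.
        bool lookup_or_insert(const std::string& fingerprint, uint64_t container_id) {
            Shard& shard = shard_for(fingerprint);
            std::lock_guard<std::mutex> guard(shard.lock);  // contention limited to one shard
            auto result = shard.entries.emplace(fingerprint, container_id);
            return !result.second;
        }

    private:
        struct Shard {
            std::mutex lock;
            std::unordered_map<std::string, uint64_t> entries;
        };

        // Route by the first byte of the fingerprint; with a cryptographic hash
        // the prefixes are evenly distributed across the 256 shards.
        Shard& shard_for(const std::string& fingerprint) {
            uint8_t prefix = fingerprint.empty() ? 0 : static_cast<uint8_t>(fingerprint[0]);
            return shards_[prefix];
        }

        std::array<Shard, 256> shards_;
    };

Each deduplication thread can call lookup_or_insert on the shared index; because the lock scope is a single shard, synchronization overhead stays low even with many threads.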
2. Partitioning the Conventional DBT System for Multiprocessors (Cited by: 1)
Authors: 马汝辉, 管海兵, 朱二周, 杨洪波, 杨吟冬, 梁阿磊. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, Issue 3, pp. 474-490 (17 pages).
The noticeable performance improvement promised by ever-increasing transistor counts is gradually trapped in a predicament, since software cannot logically and efficiently utilize hardware resources such as multiple cores. This is an inevitable problem for dynamic binary translation (DBT) systems as well. Although special-purpose hardware, exposed through interfaces provided by the DBT, can help the system achieve higher performance, its limitation is significant: it cannot be widely reused on other platforms. To overcome this drawback, we focus on building a compatible software architecture that acquires higher performance without platform dependence. In this paper, we propose a novel multithreaded architecture for DBT systems that partitions distinct function modules so as to adequately utilize multiprocessor resources. The new architecture decouples the common DBT working routine into dynamic translation, optimization, and translated-code execution phases, and then ramifies them into different threads so that they execute concurrently. Several efficient methods are presented to cope with the intractable issues that puzzle most researchers, such as the communication mechanism, cache layout, and mutual exclusion between threads. Experimental results using SPECint 2000 indicate that the new architecture achieves higher performance, speeding up the traditional DBT system by about 10.75% on average, with better CPU utilization.
Keywords: DBT, optimization, microprocessor, multi-core, SPECint 2000
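The partitioning idea in the abstract, splitting translation/optimization and execution into separate threads that communicate through queues, can be sketched as a simple producer-consumer pipeline. The code below is an illustrative C++ sketch under those assumptions; BlockingQueue, GuestBlock, and HostBlock are placeholder names, and the stage bodies stand in for real translation and dispatch logic rather than any actual DBT system.

    // Illustrative sketch of decoupling a DBT working routine into a
    // translation thread and an execution thread linked by blocking queues
    // (placeholder types; not the architecture's real implementation).
    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>

    template <typename T>
    class BlockingQueue {
    public:
        void push(T value) {
            { std::lock_guard<std::mutex> g(m_); q_.push(std::move(value)); }
            cv_.notify_one();
        }
        T pop() {
            std::unique_lock<std::mutex> g(m_);
            cv_.wait(g, [this] { return !q_.empty(); });
            T value = std::move(q_.front());
            q_.pop();
            return value;
        }
    private:
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<T> q_;
    };

    struct GuestBlock { int id; };   // a guest code region awaiting translation
    struct HostBlock  { int id; };   // translated (and optimized) host code

    int main() {
        BlockingQueue<GuestBlock> to_translate;
        BlockingQueue<HostBlock>  to_execute;
        const int kBlocks = 4;

        // Translation/optimization stage: consumes guest blocks, emits host blocks.
        std::thread translator([&] {
            for (int i = 0; i < kBlocks; ++i) {
                GuestBlock g = to_translate.pop();   // translate + optimize would go here
                to_execute.push(HostBlock{g.id});
            }
        });

        // Execution stage: runs translated code while translation continues concurrently.
        std::thread executor([&] {
            for (int i = 0; i < kBlocks; ++i) {
                HostBlock h = to_execute.pop();      // dispatch into the code cache here
                (void)h;
            }
        });

        for (int i = 0; i < kBlocks; ++i) to_translate.push(GuestBlock{i});
        translator.join();
        executor.join();
        return 0;
    }

Mutual exclusion is confined to the queues, which mirrors the abstract's concern with the communication mechanism and thread synchronization between pipeline stages.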