8 articles found
Privacy-Enhanced Data Deduplication Computational Intelligence Technique for Secure Healthcare Applications
1
Authors: Jinsu Kim, Sungwook Ryu, Namje Park. Journal: Computers, Materials & Continua (SCIE, EI), 2022, Issue 2, pp. 4169-4184 (16 pages)
A significant number of cloud storage environments already implement deduplication technology. Due to the nature of the cloud environment, a storage server capable of accommodating large-capacity storage is required. As storage capacity increases, additional storage solutions are required. Deduplication can fundamentally address this cost problem. However, deduplication poses privacy concerns because of its structure. In this paper, we point out the privacy infringement problem and propose a new deduplication technique to solve it. In the proposed technique, since the user's map structure and files are not stored on the server, the file uploader list cannot be obtained through analysis of the server's meta-information, so the user's privacy is maintained. In addition, a personal identification number (PIN) can be used to solve the file ownership problem, which provides advantages such as safety against insider breaches and sniffing attacks. The proposed mechanism required an additional time of approximately 100 ms to add an IDRef to distinguish user-file during typical deduplication; for smaller file sizes, the time required for the additional operations is similar to the operation time, but it becomes relatively smaller as the file's capacity grows.
Keywords: computational intelligence; cloud; multimedia; data deduplication
Metadata Feedback and Utilization for Data Deduplication Across WAN (cited by: 2)
2
Authors: Bing Zhou, Jiang-Tao Wen. Journal: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, Issue 3, pp. 604-623 (20 pages)
Data deduplication for file communication across a wide area network (WAN), in applications such as file synchronization and mirroring of cloud environments, usually achieves significant bandwidth saving at the cost of significant deduplication time overheads. The time overheads include the time required for data deduplication at two geographically distributed nodes (e.g., the disk access bottleneck) and the duplication query/answer operations between the sender and the receiver, since each query or answer introduces at least one round-trip time (RTT) of latency. In this paper, we present a data deduplication system across WAN with metadata feedback and metadata utilization (MFMU), in order to harness the time overheads related to data deduplication. In the proposed MFMU system, selective metadata feedbacks from the receiver to the sender are introduced to reduce the number of duplication query/answer operations. In addition, to harness the metadata-related disk I/O operations at the receiver, as well as the bandwidth overhead introduced by the metadata feedbacks, a metadata utilization component based on a hysteresis hash re-chunking mechanism is introduced. Our experimental results demonstrated that MFMU achieved an average deduplication acceleration of 20%-40%, with the bandwidth saving ratio not reduced by the metadata feedbacks, as compared with the "baseline" content-defined chunking (CDC) used in LBFS (Low-Bandwidth Network File System) and with existing state-of-the-art data deduplication solutions based on Bimodal chunking algorithms.
Keywords: data deduplication; wide area network (WAN); metadata feedback; metadata utilization
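The abstract above uses content-defined chunking (CDC) as its baseline. As a rough illustration of that general idea only, and not the MFMU system or the exact LBFS algorithm, the following Python sketch cuts a byte stream wherever a rolling hash over the most recent bytes matches a bit pattern. The window size, mask, chunk-size limits, and the polynomial hash (`WINDOW`, `MASK`, `MIN_CHUNK`, `MAX_CHUNK`, `BASE`, `MOD`) are assumed values chosen for readability; LBFS itself uses Rabin fingerprints with much larger chunks.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16           # number of trailing bytes covered by the rolling hash
MASK = (1 << 6) - 1   # cut when the low 6 hash bits are zero
MIN_CHUNK = 32        # never cut before this many bytes
MAX_CHUNK = 256       # always cut at this many bytes
BASE = 257
MOD = (1 << 31) - 1


def cdc_chunks(data):
    """Split data into chunks whose boundaries depend on content, not offsets."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Drop the oldest byte so the hash covers only the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0  # restart the rolling hash for the next chunk
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


if __name__ == "__main__":
    rng = random.Random(0)
    original = bytes(rng.getrandbits(8) for _ in range(4096))
    edited = b"inserted prefix" + original
    shared = set(cdc_chunks(original)) & set(cdc_chunks(edited))
    print(len(shared), "chunks are shared between the two versions")
```

Because a cut depends only on the bytes inside the window, an insertion near the start of a file tends to disturb only nearby boundaries, which is why CDC is a common baseline for WAN deduplication.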
Improving Metadata Caching Efficiency for Data Deduplication via In-RAM Metadata Utilization
3
Authors: Bing Zhou, Jiang-Tao Wen. Journal: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, Issue 4, pp. 805-819 (15 pages)
We describe a data deduplication system for backup storage of PC disk images, named in-RAM metadata utilizing deduplication (IR-MUD). First, in-RAM hash granularity adaptation and miniLZO-based data compression are proposed to reduce the in-RAM metadata size and thereby reduce the space overheads required by the in-RAM metadata caches. Second, an in-RAM metadata write cache, as opposed to the traditional metadata read cache, is proposed to further reduce metadata-related disk I/O operations and improve deduplication throughput. During deduplication, the metadata write cache is managed following the LRU caching policy. For each manifest that is hit in the metadata write cache, an expensive manifest reloading operation from the disk is avoided. After deduplication, all the manifests in the metadata write cache are cleared and stored on the disk. Our experimental results using a 1.5 TB real-world disk image dataset show that 1) IR-MUD achieved about 95% size reduction for the deduplication metadata, with a small time overhead introduced, 2) when the metadata write cache was not utilized, with the same RAM space size for the metadata read cache, IR-MUD achieved a 400% higher RAM hit ratio and a 50% higher deduplication throughput, as compared with the classic Sparse Indexing deduplication system, where no metadata utilization approaches are used, and 3) when the metadata write cache was utilized and enough RAM space was available, IR-MUD achieved a 500% higher RAM hit ratio compared with Sparse Indexing and a 70% higher deduplication throughput compared with IR-MUD with only a single metadata read cache. The in-RAM metadata harnessing and metadata write caching approaches of IR-MUD can be applied in most parallel deduplication systems to improve metadata caching efficiency.
Keywords: data deduplication; cache; metadata utilization
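The LRU-managed metadata write cache described in this abstract can be pictured with a minimal Python sketch. It is not the IR-MUD implementation: the class name `ManifestWriteCache`, the use of a plain dict as a stand-in for on-disk manifest storage, and the `disk_loads` counter are all assumptions made for illustration.

```python
from collections import OrderedDict


class ManifestWriteCache:
    """Hypothetical LRU write cache: evicted manifests are written back to disk."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk             # dict standing in for on-disk manifests
        self.cache = OrderedDict()   # manifest_id -> manifest, least recent first
        self.disk_loads = 0          # counts reloads that a cache hit would avoid

    def get(self, manifest_id):
        if manifest_id in self.cache:
            self.cache.move_to_end(manifest_id)  # hit: no reload from disk
            return self.cache[manifest_id]
        self.disk_loads += 1                     # miss: reload, or start empty
        manifest = list(self.disk.get(manifest_id, []))
        self.put(manifest_id, manifest)
        return manifest

    def put(self, manifest_id, manifest):
        self.cache[manifest_id] = manifest
        self.cache.move_to_end(manifest_id)
        while len(self.cache) > self.capacity:
            old_id, old_manifest = self.cache.popitem(last=False)
            self.disk[old_id] = old_manifest     # write back the LRU entry

    def flush(self):
        """After deduplication: store all cached manifests on disk and clear."""
        self.disk.update(self.cache)
        self.cache.clear()
```

In this sketch a hit in `get` skips the disk entirely, which corresponds to the avoided manifest reloading the abstract mentions, and `flush` mirrors the final step in which cached manifests are stored on disk and the cache is cleared.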
A Data Deduplication Framework of Disk Images with Adaptive Block Skipping
4
Authors: Bing Zhou, Jiang-Tao Wen. Journal: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, Issue 4, pp. 820-835 (16 pages)
We describe an efficient and easily applicable data deduplication framework with heuristic-prediction-based adaptive block skipping for real-world datasets such as disk images, which saves deduplication-related overheads and improves deduplication throughput while maintaining good deduplication efficiency. Under the framework, deduplication operations are skipped for data chunks determined as likely non-duplicates via heuristic prediction, in conjunction with a hit and matching extension process for duplication identification within skipped blocks and a hysteresis-mechanism-based hash indexing process to update the hash indices for re-encountered skipped chunks. For performance evaluation, the proposed framework was integrated and implemented in the existing Data Domain and Sparse Indexing deduplication algorithms. The experimental results based on a real-world dataset of 1.0 TB of disk images showed that the deduplication-related overheads were significantly reduced with adaptive block skipping, leading to a 30%-80% improvement in deduplication throughput when deduplication metadata were stored on the disk for Data Domain, and 25%-40% RAM space saving with a 15%-20% improvement in deduplication throughput when an in-RAM sparse index was used in Sparse Indexing. In both cases, the corresponding reductions in deduplication ratio were below 5%.
Keywords: data deduplication; metadata; adaptive block skipping
Smart data deduplication for telehealth systems in heterogeneous cloud computing
5
Authors: GAI Keke, QIU Meikang, SUN Xiaotong, ZHAO Hui. Journal: Journal of Communications and Information Networks, 2016, Issue 4, pp. 93-104 (12 pages)
The widespread application of heterogeneous cloud computing has enabled enormous advances in the real-time performance of telehealth systems. A cloud-based telehealth system allows healthcare users to obtain medical data from various data sources supported by heterogeneous cloud providers. Employing data duplications in distributed cloud databases is an alternative approach for achieving data sharing among multiple data users. However, this approach results in additional storage space being used, even though reducing data duplications would lead to a decrease in data acquisitions and real-time performance. To address this issue, this paper focuses on developing a dynamic data deduplication method that uses an intelligent blocker to determine the working mode of data duplications for each data package in heterogeneous cloud-based telehealth systems. The proposed approach is named SD2M (Smart Data Deduplication Model), in which the main algorithm applies dynamic programming to produce optimal solutions that minimize the total cost of data usage. We implement experimental evaluations to examine the adaptability of the proposed approach.
Keywords: data deduplication; telehealth; heterogeneous cloud computing; optimal solution; dynamic programming
Public Auditing for Encrypted Data with Client-Side Deduplication in Cloud Storage (cited by: 4)
6
Authors: HE Kai, HUANG Chuanhe, ZHOU Hao, SHI Jiaoli, WANG Xiaomao, DAN Feng. Journal: Wuhan University Journal of Natural Sciences (CAS, CSCD), 2015, Issue 4, pp. 291-298 (8 pages)
Storage auditing and client-side deduplication techniques have been proposed to assure data integrity and improve storage efficiency, respectively. Recently, a few schemes have started to consider these two aspects together. However, these schemes either support only plaintext data files or have been proved insecure. In this paper, we propose a public auditing scheme for cloud storage systems, in which deduplication of encrypted data and data integrity checking can be achieved within the same framework. The cloud server can correctly check ownership for new owners, and the auditor can correctly check the integrity of deduplicated data. Our scheme supports deduplication of encrypted data by using proxy re-encryption and also achieves deduplication of data tags by aggregating the tags from different owners. The analysis and experimental results show that our scheme is provably secure and efficient.
Keywords: public auditing; data integrity; storage deduplication; cloud storage
Endurable SSD-Based Read Cache for Improving the Performance of Selective Restore from Deduplication Systems
7
Authors: Jian Liu, Yun-Peng Chai, Xiao Qin, Yao-Hong Liu. Journal: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2018, Issue 1, pp. 58-78 (21 pages)
Deduplication has been commonly used in both enterprise storage systems and cloud storage. To overcome the performance challenge for the selective restore operations of deduplication systems, a solid-state-drive-based (i.e., SSD-based) read cache can be deployed to speed up restores by caching popular restore contents dynamically. Unfortunately, frequent data updates induced by classical cache schemes (e.g., LRU and LFU) significantly shorten SSDs' lifetime while slowing down I/O processes in SSDs. To address this problem, we propose a new solution, LOP-Cache, to greatly improve the write durability of SSDs as well as I/O performance by enlarging the proportion of long-term popular (LOP) data among data written into the SSD-based cache. LOP-Cache keeps LOP data in the SSD cache for a long time period to decrease the number of cache replacements. Furthermore, it prevents unpopular or unnecessary data in deduplication containers from being written into the SSD cache. We implemented LOP-Cache in a prototype deduplication system to evaluate its performance. Our experimental results indicate that LOP-Cache shortens the latency of selective restore by an average of 37.3% at the cost of a small SSD-based cache with only 5.56% of the capacity of the deduplicated data. Importantly, LOP-Cache improves SSDs' lifetime by a factor of 9.77. The evidence shows that LOP-Cache offers a cost-efficient SSD-based read cache solution to boost the performance of selective restore for deduplication systems.
Keywords: data deduplication; solid state drive (SSD); flash; cache; endurance
I-sieve: An Inline High Performance Deduplication System Used in Cloud Storage
8
Authors: Jibin Wang, Zhigang Zhao, Zhaogang Xu, Hu Zhang, Liang Li, Ying Guo. Journal: Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2015, Issue 1, pp. 17-27 (11 pages)
Data deduplication is an emerging and widely employed method for current storage systems. As this technology is gradually applied in inline scenarios such as virtual machines and cloud storage systems, this study proposes a novel deduplication architecture called I-sieve. The goal of I-sieve is to realize a high-performance data sieve system based on iSCSI in the cloud storage system. We also design the corresponding index and mapping tables and present a multi-level cache using a solid state drive to reduce RAM consumption and to optimize lookup performance. A prototype of I-sieve is implemented based on the open source iSCSI target, and many experiments have been conducted, driven by virtual machine images and testing tools. The evaluation results show excellent deduplication and foreground performance. More importantly, I-sieve can co-exist with existing deduplication systems as long as they support the iSCSI protocol.
Keywords: I-sieve; cloud storage; data deduplication
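The abstract mentions index and mapping tables but does not describe their layout. As a generic sketch of how inline block-level deduplication commonly combines the two, and not I-sieve's actual design, the hypothetical `BlockDedupStore` below keeps a fingerprint index (fingerprint to physical block) and a mapping table (logical block address to physical block). The use of SHA-256 and in-memory dicts is an assumption; reference counting, garbage collection, and the multi-level SSD cache are left out.

```python
import hashlib


class BlockDedupStore:
    """Hypothetical inline deduplicating block store (illustration only)."""

    def __init__(self):
        self.index = {}     # fingerprint -> physical block number
        self.mapping = {}   # logical block address -> physical block number
        self.blocks = []    # physical blocks, addressed by list position

    def write(self, lba, data):
        fingerprint = hashlib.sha256(data).digest()
        pbn = self.index.get(fingerprint)
        if pbn is None:
            # First time this content is seen: store it and index it.
            pbn = len(self.blocks)
            self.blocks.append(data)
            self.index[fingerprint] = pbn
        # Duplicate or not, the logical address now points at the block.
        self.mapping[lba] = pbn

    def read(self, lba):
        return self.blocks[self.mapping[lba]]


if __name__ == "__main__":
    store = BlockDedupStore()
    store.write(0, b"A" * 4096)
    store.write(7, b"A" * 4096)   # same content at a different address
    store.write(8, b"B" * 4096)
    print(len(store.blocks), "physical blocks for 3 logical writes")
```

Writes with identical content consume one physical block regardless of how many logical addresses refer to it, which is the space saving that an inline system provides on the write path.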