Funding: Supported by the National Key R&D Program of China (No. 2018YFA0704703), the National Natural Science Foundation of China (Nos. 61972215, 61972073, and 62172238), and the Natural Science Foundation of Tianjin (No. 20JCZDJC00640).
Abstract: In cloud storage, client-side deduplication is widely used to reduce storage and communication costs. In client-side deduplication, if the cloud server detects that a user's outsourced data have already been stored, the client does not need to re-upload the data. However, the information about whether data need to be uploaded forms a side channel, which adversaries can exploit to compromise data privacy. In this paper, we propose a new threat model for such side-channel attacks. Unlike existing schemes, the adversary can learn the approximate ratio of stored chunks to unstored chunks in outsourced files, and this ratio affects the probability that the adversary compromises data privacy through side-channel attacks. Under this threat model, we design two defense schemes to minimize privacy leakage; both introduce interaction protocols between clients and the server during deduplication checks to reduce the probability that the adversary compromises data privacy. We analyze the security of our schemes and evaluate their performance on a real-world dataset. Compared with existing schemes, our schemes better mitigate data privacy leakage and have a slightly lower communication cost.
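The general defense idea described here, namely making the deduplication check interactive so that a single server response does not reveal whether a chunk already exists, can be pictured with a generic randomized-threshold check. The sketch below is a minimal illustration under assumed names (`DedupServer`, `check`, `client_upload`) and threshold values; it is not the paper's actual protocols.

```python
# Hedged sketch: a randomized-threshold defense against the existence
# side channel in client-side deduplication. Even for stored chunks, the
# server keeps requesting uploads until a random per-chunk threshold is
# met, so one check result does not disclose whether the chunk exists.
import hashlib
import random

class DedupServer:
    def __init__(self):
        self.store = {}          # fingerprint -> chunk bytes
        self.thresholds = {}     # fingerprint -> random upload threshold
        self.upload_counts = {}  # fingerprint -> uploads observed so far

    def check(self, fingerprint: str) -> bool:
        """Return True if the client must upload the chunk."""
        if fingerprint not in self.thresholds:
            self.thresholds[fingerprint] = random.randint(2, 10)
            self.upload_counts[fingerprint] = 0
        if self.upload_counts[fingerprint] < self.thresholds[fingerprint]:
            self.upload_counts[fingerprint] += 1
            return True          # force an upload: hides existence
        return False             # safe to deduplicate client-side

    def put(self, fingerprint: str, chunk: bytes):
        self.store.setdefault(fingerprint, chunk)

def client_upload(server: DedupServer, chunk: bytes):
    fp = hashlib.sha256(chunk).hexdigest()
    if server.check(fp):
        server.put(fp, chunk)    # bandwidth spent, privacy preserved
    # else: the server already holds the chunk and nothing is sent
```

The trade-off illustrated here is the same one the abstract evaluates: a few redundant uploads are accepted in exchange for a weaker side channel.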
Funding: Supported by the National Natural Science Foundation of China (61373040, 61173137), the Ph.D. Programs Foundation of the Ministry of Education of China (20120141110002), and the Key Project of the Natural Science Foundation of Hubei Province (2010CDA004).
Abstract: Storage auditing and client-side deduplication techniques have been proposed to assure data integrity and improve storage efficiency, respectively. Recently, a few schemes have started to consider these two aspects together. However, these schemes either support only plaintext data files or have been proven insecure. In this paper, we propose a public auditing scheme for cloud storage systems in which deduplication of encrypted data and data integrity checking are achieved within the same framework. The cloud server can correctly check ownership for new owners, and the auditor can correctly check the integrity of deduplicated data. Our scheme supports deduplication of encrypted data by using proxy re-encryption and also achieves deduplication of data tags by aggregating the tags from different owners. The analysis and experimental results show that our scheme is provably secure and efficient.
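The ownership check for new owners can be pictured, in a much simplified form, as a challenge over randomly chosen blocks of the claimed file. The sketch below uses assumed names (`Server.challenge`, `prove`) and a fixed block size; it is a generic block-challenge illustration and does not reproduce the paper's proxy re-encryption or tag-aggregation constructions.

```python
# Minimal sketch of a block-challenge ownership check, assuming the server
# keeps the file's block hashes from the first upload. A new owner proves
# possession of the whole file without re-uploading it.
import hashlib
import os
import random

BLOCK_SIZE = 4096

def block_hashes(data: bytes):
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(data), BLOCK_SIZE)]

class Server:
    def __init__(self, data: bytes):
        self.hashes = block_hashes(data)   # kept from the first owner's upload

    def challenge(self, k: int = 8):
        # ask the claimant to hash k randomly chosen blocks
        return random.sample(range(len(self.hashes)), min(k, len(self.hashes)))

    def verify(self, indices, responses) -> bool:
        return all(self.hashes[i] == r for i, r in zip(indices, responses))

def prove(data: bytes, indices):
    return [hashlib.sha256(data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]).digest()
            for i in indices]

# Usage: a second uploader of the same file passes the challenge.
file_data = os.urandom(64 * 1024)
server = Server(file_data)
idx = server.challenge()
assert server.verify(idx, prove(file_data, idx))
```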
Abstract: Virtualization is the backbone of cloud computing, which is a developing and widely used paradigm. By finding and merging identical memory pages, memory deduplication improves memory efficiency in virtualized systems. Kernel Same Page Merging (KSM) is a Linux service for sharing memory pages in virtualized environments. Memory deduplication is vulnerable to memory disclosure attacks, which establish covert channels to reveal the contents of other co-located virtual machines. To avoid such attacks, sharing identical pages within a single user's virtual machines is permitted, but sharing contents between different users is forbidden. In our proposed approach, virtual machines with similar operating systems among the active domains on a node are recognised and organised into a homogeneous batch, and memory deduplication is performed within that batch to improve page-sharing efficiency. Compared with memory deduplication applied to the entire host, our implementation shows a significant increase in the number of shared pages when deduplication is applied batch-wise, although CPU (central processing unit) consumption also increases.
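The batch-wise sharing idea, grouping virtual machines by guest operating system and merging identical pages only within a batch, can be sketched as below. The `guest_os` field, the page representation, and the function names are assumptions made for illustration; this is not the KSM implementation itself.

```python
# Hedged sketch: group VMs into homogeneous batches by guest OS and count
# pages that could be merged only inside each batch, so pages are never
# shared across batches.
import hashlib
from collections import defaultdict

def batch_wise_merge(vms):
    """vms: list of dicts like {"name": str, "guest_os": str, "pages": [bytes]}.
    Returns, per batch, a mapping page-hash -> number of identical pages merged."""
    batches = defaultdict(list)
    for vm in vms:
        batches[vm["guest_os"]].append(vm)        # one homogeneous batch per OS

    shared = {}
    for os_name, batch in batches.items():
        page_index = defaultdict(int)
        for vm in batch:
            for page in vm["pages"]:
                page_index[hashlib.sha256(page).hexdigest()] += 1
        # only pages seen more than once inside the batch are mergeable
        shared[os_name] = {h: n for h, n in page_index.items() if n > 1}
    return shared

vms = [
    {"name": "vm1", "guest_os": "ubuntu-22.04", "pages": [b"\x00" * 4096, b"A" * 4096]},
    {"name": "vm2", "guest_os": "ubuntu-22.04", "pages": [b"\x00" * 4096]},
    {"name": "vm3", "guest_os": "centos-8", "pages": [b"\x00" * 4096]},
]
print(batch_wise_merge(vms))   # the zero page is merged within the ubuntu batch only
```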
Abstract: Cloud storage is essential for managing user data stored in and retrieved from distributed data centres. The storage service is offered on a pay-per-use basis according to the capacity consumed. Because the data centre holds massive amounts of data containing similar information and file structures kept in multiple copies, duplication inflates storage space. Existing deduplication systems do not reduce data efficiently because they are inaccurate at identifying similar data, which increases storage consumption and cost. To resolve this problem, this paper proposes an efficient storage-reduction scheme called Hash-Indexing Block-based Deduplication (HIBD), based on Segmented Bind Linkage (SBL) methods, for reducing storage in a cloud environment. Initially, preprocessing is done using a sparse augmentation technique. The preprocessed files are then segmented into blocks to build a hash index. The block contents are compared with other files through Semantic Content Source Deduplication (SCSD), which identifies similar content shared between files. Based on the content-presence count, Distance Vector Weightage Correlation (DVWC) estimates the document-similarity weight, and related files are grouped into a cluster. Finally, the Segmented Bind Linkage compares documents to find duplicate content in the cluster using the similarity weight based on coefficient matching. This implementation helps identify data redundancy efficiently and reduces the service cost of distributed cloud storage.
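The block-segmentation and hash-indexing step of such a pipeline can be pictured with the minimal sketch below. The class and method names and the fixed-size blocking are assumptions for illustration; the SCSD, DVWC, and SBL stages of HIBD are not reproduced.

```python
# Minimal sketch of block-level hash indexing: each file is split into
# fixed-size blocks, each block is fingerprinted, and a block is stored
# only once. A per-file recipe records which fingerprints rebuild the file.
import hashlib

BLOCK_SIZE = 8192

class HashIndexStore:
    def __init__(self):
        self.blocks = {}    # fingerprint -> block bytes (stored once)
        self.recipes = {}   # file name -> ordered list of fingerprints

    def put_file(self, name: str, data: bytes):
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fp, block)    # duplicate blocks are skipped
            recipe.append(fp)
        self.recipes[name] = recipe

    def get_file(self, name: str) -> bytes:
        return b"".join(self.blocks[fp] for fp in self.recipes[name])

store = HashIndexStore()
store.put_file("a.txt", b"hello world " * 2000)
store.put_file("b.txt", b"hello world " * 2000)   # fully deduplicated against a.txt
assert store.get_file("b.txt") == b"hello world " * 2000
print(len(store.blocks), "unique blocks stored for two identical files")
```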
Abstract: Due to the development of technology in medicine, millions of health-related data items, such as scanned images, are generated. Storing and handling such a massive volume of data is a great challenge. Healthcare data are stored in cloud-fog storage environments. This cloud-fog based health model allows users to obtain health-related data from different sources, and duplicated information is also present in the background; this requires additional storage, increases data-acquisition time, and leads to insecure data replication in the environment. This paper proposes to eliminate duplicate data using a window-size chunking algorithm with a biased-sampling-based Bloom filter and to secure the health data using the Advanced Signature-Based Encryption (ASE) algorithm in the fog-cloud environment (WCA-BF+ASE). WCA-BF+ASE eliminates duplicate copies of the data and minimizes storage space and maintenance cost. The data are also stored efficiently and in a highly secured manner. In terms of security level in the cloud storage environment, the Window Size Chunking Algorithm (WSCA) achieves 86.5%, Two Thresholds Two Divisors (TTTD) 80%, Ordinal in Python (ORD) 84.4%, and Bloom Filter (BF) 82%, whereas the proposed work achieves better security storage of 97%. In addition, after the deduplication process, the proposed WCA-BF+ASE method requires only a small amount of storage space for various file sizes: 10 KB for 200 MB, 22 KB for 400 MB, 35 KB for 600 MB, 38 KB for 800 MB, and 40 KB for 1000 MB.
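The chunking-plus-Bloom-filter part of such a scheme can be illustrated with a minimal sketch: fixed-window chunking and a small Bloom filter that cheaply answers "possibly seen before" for each chunk fingerprint. The filter parameters and function names are assumptions; the biased-sampling and ASE encryption stages are not shown.

```python
# Hedged sketch: window-size chunking plus a Bloom filter used as a cheap
# pre-check before consulting the exact fingerprint index. Bloom filters can
# return false positives, so a hit is still confirmed against the index.
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1 << 16, hashes: int = 3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: bytes):
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

def window_chunks(data: bytes, window: int = 4096):
    return [data[i:i + window] for i in range(0, len(data), window)]

def deduplicate(data: bytes, bloom: BloomFilter, index: set) -> int:
    stored = 0
    for chunk in window_chunks(data):
        fp = hashlib.sha256(chunk).digest()
        if bloom.might_contain(fp) and fp in index:
            continue                      # duplicate chunk: skip storing
        bloom.add(fp)
        index.add(fp)
        stored += 1
    return stored

bloom, index = BloomFilter(), set()
print(deduplicate(b"x" * 40960, bloom, index))   # 1 unique chunk stored
print(deduplicate(b"x" * 40960, bloom, index))   # 0: all chunks deduplicated
```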
Funding: Project (IRT0725) supported by the Changjiang Innovative Group of the Ministry of Education, China.
Abstract: Data deduplication, as a compression method, has been widely used in most backup systems to improve bandwidth and space efficiency. As the volume of data to be backed up explodes, the two main challenges in data deduplication are the CPU-intensive chunking and hashing work and the I/O-intensive disk-index access latency. Because the CPU-intensive work has been vastly parallelized and sped up by multi-core and many-core processors, I/O latency is likely becoming the bottleneck in data deduplication. To alleviate the challenge of I/O latency on multi-core systems, a multi-threaded deduplication (Multi-Dedup) architecture was proposed. The main idea of Multi-Dedup is to use parallel deduplication threads to hide the I/O latency. A prefix-based concurrent index was designed to maintain the internal consistency of the deduplication index with low synchronization overhead. In addition, a collision-less cache array was designed to preserve locality and similarity within the parallel threads. In experiments on various real-world datasets, Multi-Dedup achieves 3-5 times performance improvements when incorporating the locality-based ChunkStash and local-similarity-based SiLo methods. Moreover, Multi-Dedup dramatically decreases the synchronization overhead and achieves 1.5-2 times performance improvements compared with traditional lock-based synchronization methods.
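A prefix-based concurrent index can be sketched as a fingerprint table partitioned by the fingerprint's leading byte, with one lock per partition so that parallel deduplication threads rarely contend on the same lock. The partition count and all names below are illustrative assumptions, not the Multi-Dedup implementation.

```python
# Minimal sketch of a prefix-partitioned fingerprint index: the first byte
# of each fingerprint selects a partition, and each partition has its own
# lock, so parallel deduplication threads mostly touch different locks.
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor

class PrefixIndex:
    def __init__(self, partitions: int = 256):
        self.partitions = [dict() for _ in range(partitions)]
        self.locks = [threading.Lock() for _ in range(partitions)]

    def insert_if_absent(self, fingerprint: bytes, location: int) -> bool:
        """Return True if the fingerprint was new (the chunk must be stored)."""
        part = fingerprint[0] % len(self.partitions)
        with self.locks[part]:
            if fingerprint in self.partitions[part]:
                return False
            self.partitions[part][fingerprint] = location
            return True

def dedup_worker(index: PrefixIndex, chunks) -> int:
    stored = 0
    for loc, chunk in chunks:
        fp = hashlib.sha256(chunk).digest()
        if index.insert_if_absent(fp, loc):
            stored += 1
    return stored

index = PrefixIndex()
data = [(i, (b"%d" % (i % 100)) * 512) for i in range(10_000)]   # 100 unique chunks
with ThreadPoolExecutor(max_workers=4) as pool:
    quarters = [data[i::4] for i in range(4)]
    totals = list(pool.map(lambda part: dedup_worker(index, part), quarters))
print(sum(totals), "unique chunks stored out of", len(data))
```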
Abstract: Cloud computing and its related technologies have developed at a tremendous pace. However, centralized cloud storage faces several challenges, such as latency, storage, and packet drops in the network. Cloud storage attracts attention because of its huge data capacity and its promise to secure secret information. Most developments in cloud storage have been positive, apart from the need for better cost models and effectiveness, but data leakage remains a billion-dollar question for consumers. Traditional data-security techniques are usually based on cryptographic methods, but these approaches may not be able to withstand an attack from inside the cloud server. We therefore suggest a security model called multi-layer storage (MLS) based on elliptic curve cryptography (ECC). The suggested model focuses on the significance of cloud storage together with data protection and the removal of duplicates at the initial level. Based on divide-and-combine methodologies, the data are divided into three parts. The first two portions are stored on the local system and on fog nodes, secured with an encoding and decoding technique. The remaining part of the data is encrypted and saved in the cloud. The viability of our model has been tested in terms of safety measures and test evaluation, and it is a powerful complement to existing methods in cloud storage.
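The divide-and-combine layout, with part of the data kept locally, part on fog nodes, and the rest encrypted in the cloud, can be pictured as a simple three-way split. The split points, the placeholder `encrypt` call, and the tier names below are assumptions; the actual MLS model uses ECC-based encryption, not this illustrative XOR keystream.

```python
# Hedged sketch of the three-way split used by the multi-layer storage idea:
# the file is divided into three parts placed on the local system, a fog
# node, and the cloud. encrypt() is a stand-in for illustration only and
# provides no real security.
import hashlib

def encrypt(part: bytes, key: bytes) -> bytes:
    # placeholder keystream cipher (XOR), symmetric so it also decrypts
    keystream = hashlib.sha256(key).digest() * (len(part) // 32 + 1)
    return bytes(a ^ b for a, b in zip(part, keystream))

def split_three_ways(data: bytes, key: bytes):
    third = len(data) // 3
    local_part = data[:third]                    # stays on the local system
    fog_part = data[third:2 * third]             # stored on a fog node
    cloud_part = encrypt(data[2 * third:], key)  # encrypted before the cloud
    return local_part, fog_part, cloud_part

def recombine(local_part, fog_part, cloud_part, key):
    return local_part + fog_part + encrypt(cloud_part, key)

data, key = b"patient-record:" * 100, b"secret"
parts = split_three_ways(data, key)
assert recombine(*parts, key) == data
```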
Funding: Supported in part by ZTE Industry-Academia-Research Cooperation Funds; the National Natural Science Foundation of China under Grant Nos. 61502191, 61502190, 61602197, and 61772222; the Fundamental Research Funds for the Central Universities under Grant Nos. 2017KFYXJJ065 and 2016YXMS085; the Hubei Provincial Natural Science Foundation of China under Grant Nos. 2016CFB226 and 2016CFB192; and the Key Laboratory of Information Storage System, Ministry of Education of China.
Abstract: Modern backup systems exploit data deduplication technology to save storage space but suffer from the fragmentation problem caused by deduplication. Fragmentation degrades restore performance because restoring requires reading chunks scattered across different containers. To improve restore performance, the state-of-the-art History-Aware Rewriting algorithm (HAR) collects fragmented chunks in the last backup and rewrites them in the next backup. However, because it rewrites fragmented chunks only in the next backup, HAR fails to eliminate the internal fragmentation caused by self-referenced chunks (chunks that occur more than once within a backup) in the current backup, which degrades restore performance. In this paper, we propose Selectively Rewriting Self-Referenced Chunks (SRSC), a scheme that designs a buffer to simulate a restore cache, identifies internal fragmentation in the cache, and selectively rewrites the corresponding chunks. Our experimental results on two real-world datasets show that SRSC improves restore performance by 45% with an acceptable sacrifice of the deduplication ratio.
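The core idea described, simulating a restore cache with a buffer during backup and rewriting self-referenced chunks whose containers would miss that cache, can be pictured with the simplified sketch below. The cache capacity, container size, eviction policy, and names are assumptions and not the paper's exact algorithm.

```python
# Simplified sketch of the SRSC idea: while writing a backup, simulate the
# restore cache with a FIFO buffer of container IDs. When a self-referenced
# chunk (already written earlier in this backup) repeats but its container
# would no longer be in the restore cache, rewrite it into the current
# container instead of referencing the distant copy.
from collections import OrderedDict

CACHE_CONTAINERS = 4      # simulated restore-cache size (assumption)
CONTAINER_CHUNKS = 8      # chunks per container (assumption)

def plan_backup(stream):
    cache = OrderedDict()             # simulated restore cache of container IDs
    written = {}                      # fingerprint -> container it was written to
    rewrites = []
    container_id, fill = 0, 0

    def touch(cid):
        cache[cid] = None
        cache.move_to_end(cid)
        if len(cache) > CACHE_CONTAINERS:
            cache.popitem(last=False)

    for fp in stream:
        if fp in written and written[fp] in cache:
            touch(written[fp])        # self-reference that the restore cache would serve
            continue
        if fp in written:
            rewrites.append(fp)       # self-reference that would miss the cache
        written[fp] = container_id
        touch(container_id)
        fill += 1
        if fill == CONTAINER_CHUNKS:
            container_id, fill = container_id + 1, 0
    return rewrites

# Chunk "a" repeats far apart: its second reference would miss the simulated
# restore cache, so it is selected for rewriting near the current position.
stream = ["a"] + [f"u{i}" for i in range(40)] + ["a"]
print(plan_backup(stream))            # ['a']
```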
Abstract: In deduplication, the index-lookup disk bottleneck is a major obstacle that limits the throughput of backup processes. One way to minimize this issue and boost speed is to use very coarse-grained chunks for deduplication, at the cost of lower storage savings and limited scalability. Another way is to distribute the deduplication process among multiple nodes, but this approach introduces the storage-node island effect and also incurs high communication cost. In this paper, we explore dCACH, a content-aware clustered and hierarchical deduplication system that implements a hybrid of inline coarse-grained and offline fine-grained distributed deduplication, where routing decisions are made for sets of files instead of single files. It utilizes Bloom filters to detect similarity between a data stream and previous data streams and performs stateful routing, which solves the storage-node island problem. Moreover, it exploits the negligibly small amount of content shared among chunks from different file types to create groups of files and deduplicate each group in its own fingerprint-index space. It implements hierarchical deduplication to reduce the size of the fingerprint indexes at the global level, where only files and big segments are deduplicated. Locality is created and exploited first through the big segments deduplicated at the global level and second by routing sets of consecutive files together to one storage node. Furthermore, using Bloom filters for similarity detection between streams has low communication and computation cost while achieving duplicate-elimination performance comparable to single-node deduplication. dCACH is evaluated using a prototype deployed on a server environment distributed over four separate machines. It is shown to be 10× faster than Extreme_Binn with minimal communication overhead, while its duplicate-elimination effectiveness is on a par with a single-node deduplication system.
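The stateful-routing part, summarizing each storage node's previously seen content in a Bloom filter and routing a group of files to the node whose filter overlaps most with the group's fingerprints, can be sketched as below. The filter parameters and names are assumptions, and the hierarchical and offline fine-grained stages of dCACH are not shown.

```python
# Hedged sketch of Bloom-filter-based stateful routing: each storage node
# keeps a Bloom filter of fingerprints it has stored; a new group of files
# is routed to the node whose filter matches the most of the group's
# fingerprints, creating locality without shipping full indexes around.
import hashlib

class NodeFilter:
    def __init__(self, size_bits: int = 1 << 20, hashes: int = 4):
        self.size, self.hashes = size_bits, hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, fp: bytes):
        for i in range(self.hashes):
            d = hashlib.sha256(bytes([i]) + fp).digest()
            yield int.from_bytes(d[:8], "big") % self.size

    def add(self, fp: bytes):
        for p in self._positions(fp):
            self.bits[p // 8] |= 1 << (p % 8)

    def hits(self, fingerprints) -> int:
        return sum(all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(fp)) for fp in fingerprints)

def route_group(group_fingerprints, node_filters) -> int:
    """Pick the node most similar to this group of files, then update its filter."""
    scores = [f.hits(group_fingerprints) for f in node_filters]
    best = max(range(len(node_filters)), key=lambda i: scores[i])
    for fp in group_fingerprints:      # stateful update after routing
        node_filters[best].add(fp)
    return best

nodes = [NodeFilter() for _ in range(4)]
group_a = [hashlib.sha256(b"a%d" % i).digest() for i in range(100)]
first = route_group(group_a, nodes)
assert route_group(group_a, nodes) == first   # similar content routes to the same node
```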
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2019R1I1A3A01062789) (received by N. Park).
Abstract: A significant number of cloud storage environments already implement deduplication technology. Due to the nature of the cloud environment, a storage server capable of accommodating large-capacity storage is required, and as storage capacity increases, additional storage solutions are required. By leveraging deduplication, the cost problem can be fundamentally addressed. However, deduplication poses privacy concerns because of its very structure. In this paper, we point out this privacy-infringement problem and propose a new deduplication technique to solve it. In the proposed technique, since the user's map structure and files are not stored on the server, the file-uploader list cannot be obtained by analysing the server's meta-information, so user privacy is maintained. In addition, a personal identification number (PIN) can be used to solve the file-ownership problem, providing advantages such as safety against insider breaches and sniffing attacks. The proposed mechanism requires approximately 100 ms of additional time to add an IDRef that distinguishes the user-file relationship during typical deduplication; for smaller files, the time required for the additional operations is similar to the base operation time, but it becomes relatively smaller as the file size grows.
Funding: This work was supported by ZTE Industry-University-Institute Cooperation Funds under Grant No. 2019ZTE03-01; the National Keystone R&D Program of China under Grant No. 2017YFB0803204; the National Natural Science Foundation of China (NSFC) under Grant No. 61671001; the Guangdong Provincial R&D Key Program under Grant No. 2019B010137001; the Shenzhen Research Programs under Grant Nos. JCYJ20190808155607340, JSGG20170406144032901, JSGG20170824095858416, and JCYJ20170306092030521; and PCL Future Regional Network Facilities for Large-scale Experiments and Applications under Grant No. PCL2018KP001.
Abstract: In cloud storage architectures, deduplication encrypted with a convergent key is one of the important data compression technologies; it effectively improves the utilization of space and bandwidth. To further refine the usage scenarios for various user permissions and enhance users' data security, we propose a blockchain-based differentially authorized deduplication system. The proposed system optimizes the traditional Proof of Vote (PoV) consensus algorithm and simplifies the existing differential authorization process to realize credible management and dynamic updates of authority. Based on the decentralized property of blockchain, we overcome the centralized single-point-of-failure problem of traditional differentially authorized deduplication systems. In addition, the operations of legitimate users are recorded in blocks to ensure the traceability of their behaviour.
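The convergent-key idea mentioned here, deriving the encryption key from the content itself so that identical plaintexts produce identical ciphertexts and can still be deduplicated, can be illustrated with the short sketch below. It uses the third-party `cryptography` package and is a generic convergent-encryption sketch; the paper's blockchain, PoV, and differential-authorization machinery is not reproduced.

```python
# Minimal convergent-encryption sketch (requires `pip install cryptography`):
# the key is the SHA-256 hash of the plaintext, so two users encrypting the
# same chunk produce the same ciphertext, which the server can deduplicate
# without seeing the plaintext. The fixed nonce is acceptable here only
# because each content-derived key encrypts exactly one distinct plaintext.
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE = b"\x00" * 12                       # deterministic by design (see note above)

def convergent_encrypt(chunk: bytes):
    key = hashlib.sha256(chunk).digest()   # convergent key, derived from content
    ciphertext = AESGCM(key).encrypt(NONCE, chunk, None)
    tag = hashlib.sha256(key).hexdigest()  # dedup tag the server indexes on
    return tag, ciphertext, key

def convergent_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    return AESGCM(key).decrypt(NONCE, ciphertext, None)

# Two independent users encrypt the same chunk and get identical ciphertexts,
# so the server stores one copy under the shared tag.
chunk = b"the same medical record chunk"
tag1, ct1, key1 = convergent_encrypt(chunk)
tag2, ct2, key2 = convergent_encrypt(chunk)
assert tag1 == tag2 and ct1 == ct2
assert convergent_decrypt(ct1, key2) == chunk
```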
Abstract: Cloud computing technology is the culmination of technical advancements in computer networks and in hardware and software capabilities, which collectively gave rise to computing as a utility. It offers a plethora of utilities to its clients worldwide in a very cost-effective way, and this feature is enticing users and companies to migrate their infrastructure to cloud platforms. Swayed by its gigantic capacity and easy access, clients upload replicated data to the cloud, resulting in an unnecessary crunch on storage in data centres. Many data compression techniques have come to the rescue, but none could serve a capacity as large as the cloud's; hence, research has turned to deduplicating the data and harvesting the space in existing storage capacity that was being wasted on duplicate data. To provide better cloud services through scalable provisioning of resources, interoperability has brought many Cloud Service Providers (CSPs) under one umbrella, termed a Cloud Federation. Many policies have been devised for private and public cloud deployment models for searching for and eradicating replicated copies using hashing techniques, whereas the search for duplicate copies is not restricted to any one type of CSP but extends to the set of public or private CSPs contributing to the federation. It was found that, even with advanced deduplication techniques for federated clouds, a single file may be stored in both the private and the public group of the same cloud federation because of the differing nature of CSPs; this can be handled if an optimized deduplication strategy is provided to address the issue. Therefore, this study aims to further optimize a deduplication strategy for the federated cloud environment and suggests a central management agent for the federation. Since no relevant prior work was found, this paper implements the concept of a federation agent and uses a file-level deduplication technique to accomplish this approach.
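The federation-agent idea, a central agent keeping a file-level hash index that spans the private and public CSPs in the federation so that the same file is never stored in both groups, can be pictured with the sketch below. All class and method names are assumptions made for illustration, not the paper's implementation.

```python
# Hedged sketch of file-level deduplication coordinated by a central
# federation agent: the agent keeps one hash index spanning every CSP in
# the federation, so a file already held by any member (public or private)
# is never stored a second time elsewhere in the federation.
import hashlib

class FederationAgent:
    def __init__(self, csps):
        self.csps = {name: {} for name in csps}   # CSP name -> {file hash: data}
        self.index = {}                            # file hash -> owning CSP

    def upload(self, csp: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        owner = self.index.get(digest)
        if owner is not None:
            return f"duplicate: already stored on {owner}"
        self.csps[csp][digest] = data              # first copy in the federation
        self.index[digest] = csp
        return f"stored on {csp}"

agent = FederationAgent(["private-csp-1", "public-csp-1", "public-csp-2"])
data = b"quarterly-report.pdf contents"
print(agent.upload("private-csp-1", data))  # stored on private-csp-1
print(agent.upload("public-csp-2", data))   # duplicate: not stored again
```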