Journal Articles
266 articles found
1. Threat Model and Defense Scheme for Side-Channel Attacks in Client-Side Deduplication (Cited by 2)
Authors: Guanxiong Ha, Hang Chen, Chunfu Jia, Mingyue Li. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2023, Issue 1, pp. 1-12.
In cloud storage, client-side deduplication is widely used to reduce storage and communication costs. In client-side deduplication, if the cloud server detects that the user's outsourced data have already been stored, the client does not need to re-upload them. However, the information on whether data need to be uploaded can be used as a side channel, which adversaries can exploit to compromise data privacy. In this paper, we propose a new threat model against side-channel attacks. Unlike existing schemes, the adversary can learn the approximate ratio of stored chunks to unstored chunks in outsourced files, and this ratio affects the probability that the adversary compromises data privacy through side-channel attacks. Under this threat model, we design two defense schemes to minimize privacy leakage; both design interaction protocols between clients and the server during deduplication checks to reduce the probability that the adversary compromises data privacy. We analyze the security of our schemes and evaluate their performance on a real-world dataset. Compared with existing schemes, our schemes better mitigate data privacy leakage and have a slightly lower communication cost.
Keywords: cloud storage; deduplication; side-channel; privacy
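The existence check described in this abstract is easy to see in miniature. The following Python sketch is not from the paper; DedupServer, dummy_rate, and the randomized dummy-upload mitigation are illustrative assumptions. It shows why the server's "already stored" response acts as a side channel and how randomizing the client's upload behaviour blurs it.

```python
import hashlib
import random

class DedupServer:
    """Toy cloud server: stores chunks by their SHA-256 fingerprint."""
    def __init__(self):
        self.store = {}

    def has_chunk(self, fp: str) -> bool:
        # The plain existence check is the side channel: an attacker can probe
        # guessed contents and learn whether another user already uploaded them.
        return fp in self.store

    def upload(self, fp: str, data: bytes) -> None:
        self.store[fp] = data

def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

def client_upload(server: DedupServer, chunk: bytes, dummy_rate: float = 0.3) -> bool:
    """Upload a chunk; with probability dummy_rate, upload even if the server
    already has it, so the observable traffic no longer reveals existence
    deterministically (a generic mitigation, not the paper's exact protocol).
    Returns True if bytes were actually sent."""
    fp = fingerprint(chunk)
    if server.has_chunk(fp) and random.random() >= dummy_rate:
        return False          # deduplicated: nothing sent
    server.upload(fp, chunk)  # real or dummy upload
    return True

if __name__ == "__main__":
    srv = DedupServer()
    client_upload(srv, b"monthly report v1")         # first copy is stored
    sent = client_upload(srv, b"monthly report v1")  # may or may not resend
    print("second upload sent bytes:", sent)
```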
2. Public Auditing for Encrypted Data with Client-Side Deduplication in Cloud Storage (Cited by 4)
Authors: HE Kai, HUANG Chuanhe, ZHOU Hao, SHI Jiaoli, WANG Xiaomao, DAN Feng. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2015, Issue 4, pp. 291-298.
Storage auditing and client-side deduplication techniques have been proposed to assure data integrity and improve storage efficiency, respectively. Recently, a few schemes have started to consider these two aspects together; however, they either support only plaintext data files or have been proven insecure. In this paper, we propose a public auditing scheme for cloud storage systems in which deduplication of encrypted data and data integrity checking are achieved within the same framework. The cloud server can correctly check ownership for new owners, and the auditor can correctly check the integrity of deduplicated data. Our scheme supports deduplication of encrypted data by using proxy re-encryption and also achieves deduplication of data tags by aggregating the tags from different owners. The analysis and experimental results show that our scheme is provably secure and efficient.
Keywords: public auditing; data integrity; storage deduplication; cloud storage
3. Homogeneous Batch Memory Deduplication Using Clustering of Virtual Machines
Authors: N. Jagadeeswari, V. Mohan Raj. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 1, pp. 929-943.
Virtualization is the backbone of cloud computing, a developing and widely used paradigm. By finding and merging identical memory pages, memory deduplication improves memory efficiency in virtualized systems. Kernel Same-page Merging (KSM) is a Linux service for sharing memory pages in virtualized environments. Memory deduplication is vulnerable to memory disclosure attacks, which establish covert channels to reveal the contents of co-located virtual machines. To avoid such attacks, sharing of identical pages within a single user's virtual machines is permitted, but sharing of content between different users is forbidden. In our proposed approach, virtual machines with similar operating systems among the active domains on a node are recognized and organized into a homogeneous batch, and memory deduplication is performed within that batch to improve page-sharing efficiency. Compared with memory deduplication applied to the entire host, the implementation shows a significant increase in the number of pages shared when deduplication is applied batch-wise, along with an increase in CPU (central processing unit) consumption.
Keywords: Kernel same-page merging; memory deduplication; virtual machine sharing; content-based sharing
4. Hash-Indexing Block-Based Deduplication Algorithm for Reducing Storage in the Cloud
Authors: D. Viji, S. Revathy. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 7, pp. 27-42.
Cloud storage is essential for managing user data stored in and retrieved from distributed data centres, where the storage service is offered on a pay-per-use basis according to the amount of data kept. Because the massive volume of data held in a data centre contains similar information and file structures in multiple copies, duplication increases the storage space required. Existing deduplication systems do not achieve efficient data reduction because of inaccuracy in identifying similar data, which makes it difficult to keep storage consumption and cost under control. To resolve this problem, this paper proposes an efficient storage-reduction method called Hash-Indexing Block-based Deduplication (HIBD), based on Segmented Bind Linkage (SBL), for reducing storage in a cloud environment. Initially, preprocessing is performed using a sparse augmentation technique. The preprocessed files are then segmented into blocks to build a hash index. Block contents are compared with other files through Semantic Content Source Deduplication (SCSD), which identifies similar content across files. Based on the content presence count, the Distance Vector Weightage Correlation (DVWC) estimates the document similarity weight, and related files are grouped into a cluster. Finally, segmented bind linkage compares the documents to find duplicate content in the cluster using the similarity weight and coefficient matching. This approach identifies data redundancy efficiently and reduces the service cost of distributed cloud storage.
Keywords: cloud computing; deduplication; hash indexing; relational content analysis; document clustering; cloud storage; record linkage
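For readers unfamiliar with block-level, hash-indexed deduplication, the following Python sketch shows the generic mechanism this kind of scheme builds on: fixed-size blocks, SHA-256 fingerprints, and a hash index that stores each unique block once. It is a baseline illustration only; HIBD's sparse augmentation, SCSD, DVWC, and segmented bind linkage stages are not modelled, and BLOCK_SIZE and the class names are assumptions.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # fixed-size blocks; real segmentation is more elaborate

class BlockStore:
    """Generic hash-indexed block store: each unique block is kept once,
    and files are recorded as lists of block fingerprints."""
    def __init__(self):
        self.blocks = {}                  # fingerprint -> block bytes
        self.files = {}                   # file name  -> [fingerprints]
        self.refcount = defaultdict(int)  # fingerprint -> reference count

    def put(self, name: str, data: bytes) -> int:
        """Store a file; return how many bytes were newly written."""
        new_bytes = 0
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.blocks:     # hash-index lookup decides dedup
                self.blocks[fp] = block
                new_bytes += len(block)
            self.refcount[fp] += 1
            recipe.append(fp)
        self.files[name] = recipe
        return new_bytes

    def get(self, name: str) -> bytes:
        return b"".join(self.blocks[fp] for fp in self.files[name])

if __name__ == "__main__":
    store = BlockStore()
    payload = b"A" * 10000
    print(store.put("a.txt", payload))   # stores the unique blocks
    print(store.put("b.txt", payload))   # stores nothing new
    assert store.get("b.txt") == payload
```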
5. Health Data Deduplication Using Window Chunking-Signature Encryption in Cloud
Authors: G. Neelamegam, P. Marikkannu. Intelligent Automation & Soft Computing (SCIE), 2023, Issue 4, pp. 1079-1093.
With the development of technology in medicine, millions of health-related records, such as scanned images, are generated, and storing and handling this massive volume of data is a great challenge. Healthcare data are stored in cloud-fog storage environments. This cloud-fog health model allows users to obtain health-related data from different sources, so duplicate information also accumulates in the background, which requires additional storage space, increases data-acquisition time, and leads to insecure data replication. This paper proposes to eliminate duplicate data using a window-size chunking algorithm with a biased-sampling-based Bloom filter and to secure the health data using the Advanced Signature-Based Encryption (ASE) algorithm in the fog-cloud environment (WCA-BF+ASE). WCA-BF+ASE eliminates duplicate copies of data, minimizing storage space and maintenance cost, while storing data efficiently and securely. In terms of security level in the cloud storage environment, the Window Chunking Algorithm (WSCA) achieved 86.5%, Two Thresholds Two Divisors (TTTD) 80%, Ordinal in Python (ORD) 84.4%, and the Bloom filter (BF) 82%, while the proposed work achieved better secure storage at 97%. After applying the deduplication process, the proposed WCA-BF+ASE method also required less storage space across file sizes: 200 MB required only 10 KB, 400 MB only 22 KB, 600 MB 35 KB, 800 MB 38 KB, and 1000 MB 40 KB.
Keywords: health data; encryption; chunks; cloud; fog; deduplication; Bloom filter; key generation
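A minimal sketch of the two building blocks named in this abstract, window-based chunking and a Bloom-filter pre-check, is shown below in Python. It assumes fixed windows and a plain SHA-256 fingerprint set; the biased sampling, signature encryption (ASE), and fog tier of WCA-BF+ASE are not modelled.

```python
import hashlib

class BloomFilter:
    """Small Bloom filter used as a fast 'probably seen before' pre-check
    in front of the exact fingerprint index."""
    def __init__(self, size_bits: int = 1 << 20, hashes: int = 3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = 0  # Python int as a bit vector

    def _positions(self, item: bytes):
        for i in range(self.hashes):
            digest = hashlib.sha256(item + i.to_bytes(1, "big")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits >> pos & 1 for pos in self._positions(item))

def window_chunks(data: bytes, window: int = 4096):
    """Split data into fixed windows (the paper's chunker is more adaptive)."""
    for i in range(0, len(data), window):
        yield data[i:i + window]

def deduplicate(data: bytes, bloom: BloomFilter, index: set) -> list:
    """Return only the chunks that are new; the Bloom filter screens candidates,
    the exact set confirms them (avoids acting on false positives)."""
    unique = []
    for chunk in window_chunks(data):
        fp = hashlib.sha256(chunk).digest()
        if bloom.might_contain(fp) and fp in index:
            continue                      # duplicate chunk, skip upload
        bloom.add(fp)
        index.add(fp)
        unique.append(chunk)
    return unique

if __name__ == "__main__":
    bloom, index = BloomFilter(), set()
    print(len(deduplicate(b"x" * 8192, bloom, index)))  # 1 unique 4 KB chunk
    print(len(deduplicate(b"x" * 8192, bloom, index)))  # 0, already seen
```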
6. Using Multi-Threads to Hide Deduplication I/O Latency with Low Synchronization Overhead (Cited by 1)
Authors: 朱锐, 秦磊华, 周敬利, 郑寰. Journal of Central South University (SCIE, EI, CAS), 2013, Issue 6, pp. 1582-1591.
Data deduplication, as a compression method, has been widely used in most backup systems to improve bandwidth and space efficiency. As the data to be backed up explode, the two main challenges in data deduplication are the CPU-intensive chunking and hashing work and the I/O-intensive disk-index access latency. Since CPU-intensive work has been vastly parallelized and sped up by multi-core and many-core processors, I/O latency is likely to become the bottleneck in data deduplication. To alleviate this challenge in multi-core systems, a multi-threaded deduplication (Multi-Dedup) architecture was proposed. The main idea of Multi-Dedup is to use parallel deduplication threads to hide the I/O latency. A prefix-based concurrent index was designed to maintain the internal consistency of the deduplication index with low synchronization overhead. In addition, a collision-less cache array was designed to preserve locality and similarity within the parallel threads. In experiments on various real-world datasets, Multi-Dedup achieves 3-5 times performance improvement when incorporated with the locality-based ChunkStash and local-similarity-based SiLo methods. Multi-Dedup also dramatically decreases synchronization overhead and achieves 1.5-2 times performance improvement compared with traditional lock-based synchronization methods.
Keywords: multi-thread; multi-core; parallel; data deduplication
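The prefix-based concurrent index can be pictured as a fingerprint table partitioned by hash prefix with one lock per partition, so threads rarely contend. The Python sketch below illustrates that general idea only; the partition count and locking granularity are assumptions, and Multi-Dedup's collision-less cache array is not modelled.

```python
import hashlib
import threading

class PrefixPartitionedIndex:
    """Fingerprint index split into partitions by the first byte of the hash.
    Threads working on different prefixes never contend for the same lock,
    which is one simple way to keep synchronization overhead low."""
    def __init__(self, partitions: int = 256):
        self.partitions = [dict() for _ in range(partitions)]
        self.locks = [threading.Lock() for _ in range(partitions)]

    def lookup_or_insert(self, fp: bytes, location: int) -> bool:
        """Return True if fp was already present (duplicate chunk)."""
        part = fp[0] % len(self.partitions)
        with self.locks[part]:
            if fp in self.partitions[part]:
                return True
            self.partitions[part][fp] = location
            return False

def worker(index: PrefixPartitionedIndex, chunks, stats, tid):
    dup = 0
    for loc, chunk in enumerate(chunks):
        fp = hashlib.sha256(chunk).digest()
        if index.lookup_or_insert(fp, loc):
            dup += 1
    stats[tid] = dup

if __name__ == "__main__":
    index = PrefixPartitionedIndex()
    stream = [bytes([i % 50]) * 4096 for i in range(1000)]  # many duplicates
    stats = {}
    threads = [threading.Thread(target=worker, args=(index, stream[i::4], stats, i))
               for i in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("duplicate chunks found:", sum(stats.values()))  # 950 of 1000
```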
7. Secured Data Storage Using Deduplication in Cloud Computing Based on Elliptic Curve Cryptography (Cited by 1)
Authors: N. Niyaz Ahamed, N. Duraipandian. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 4, pp. 83-94.
Cloud computing and its related technologies have developed tremendously. However, centralized cloud storage faces challenges such as latency, storage limits, and packet drops in the network. Cloud storage attracts attention because of its huge data capacity and the need to keep secret information secure. Most developments in cloud storage have been positive apart from cost models and effectiveness, but data leakage remains a billion-dollar question for consumers. Traditional data-security techniques are usually based on cryptographic methods, but these approaches may not be able to withstand an attack from inside the cloud server. We therefore propose a model called multi-layer storage (MLS) whose security is based on elliptic curve cryptography (ECC). The proposed model focuses on cloud storage together with data protection and the removal of duplicates at the initial level. Following a divide-and-combine methodology, the data are divided into three parts: the first two portions are stored in the local system and in fog nodes and are secured using encoding and decoding techniques, while the remaining encrypted part is saved in the cloud. The viability of our model has been tested in terms of security measures and test evaluation, and it is a powerful complement to existing cloud storage methods.
Keywords: cloud storage; deduplication; fog computing; elliptic curve cryptography
8. SRSC: Improving Restore Performance for Deduplication-Based Storage Systems
Authors: ZUO Chunxue, WANG Fang, TANG Xiaolan, ZHANG Yucheng, FENG Dan. ZTE Communications, 2019, Issue 2, pp. 59-66.
Modern backup systems exploit data deduplication technology to save storage space but suffer from the fragmentation problem it causes. Fragmentation degrades restore performance because restoring requires chunks that are scattered across different containers. To improve restore performance, the state-of-the-art History-Aware Rewriting algorithm (HAR) collects fragmented chunks in the last backup and rewrites them in the next backup. However, because it rewrites fragmented chunks only in the next backup, HAR fails to eliminate the internal fragmentation caused by self-referenced chunks (chunks that appear more than twice in a backup) in the current backup, which degrades restore performance. In this paper, we propose Selectively Rewriting Self-Referenced Chunks (SRSC), a scheme that uses a buffer to simulate a restore cache, identifies internal fragmentation in the cache, and selectively rewrites the responsible chunks. Experimental results on two real-world datasets show that SRSC improves restore performance by 45% with an acceptable sacrifice of deduplication ratio.
Keywords: data deduplication; fragmentation; restore performance
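A rough way to picture "simulating a restore cache" is to replay the backup's chunk-reference sequence through a small LRU buffer and flag self-referenced chunks whose later references still miss. The sketch below is an assumption-laden illustration; the cache size, LRU policy, and selection threshold are not SRSC's actual parameters.

```python
from collections import OrderedDict, Counter

def find_internal_fragments(chunk_refs, cache_slots=4):
    """Simulate an LRU restore cache over a backup's chunk-reference sequence
    and report self-referenced chunks (referenced more than twice) whose later
    references would still miss the cache; those are candidates for rewriting."""
    counts = Counter(chunk_refs)
    cache = OrderedDict()              # chunk id -> None, ordered by recency
    misses = Counter()
    for ref in chunk_refs:
        if ref in cache:
            cache.move_to_end(ref)     # cache hit: cheap during restore
        else:
            misses[ref] += 1           # cache miss: container read needed
            cache[ref] = None
            if len(cache) > cache_slots:
                cache.popitem(last=False)
    return [c for c in counts
            if counts[c] > 2 and misses[c] > 1]   # self-referenced and re-fetched

if __name__ == "__main__":
    # Chunk A recurs far apart, so a small cache keeps evicting it.
    sequence = ["A", "B", "C", "D", "E", "A", "F", "G", "H", "I", "A"]
    print(find_internal_fragments(sequence))      # ['A']
```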
9. dCACH: Content Aware Clustered and Hierarchical Distributed Deduplication
Authors: Girum Dagnaw, Ke Zhou, Hua Wang. Journal of Software Engineering and Applications, 2019, Issue 11, pp. 460-490.
In deduplication, the index-lookup disk bottleneck is a major obstacle that limits the throughput of backup processes. One way to minimize its effect and boost speed is to deduplicate at a very coarse chunk granularity, at the cost of low storage savings and limited scalability. Another way is to distribute the deduplication process among multiple nodes, but this approach introduces the storage-node island effect and incurs high communication cost. In this paper, we explore dCACH, a content-aware clustered and hierarchical deduplication system that implements a hybrid of inline coarse-grained and offline fine-grained distributed deduplication, where routing decisions are made for sets of files instead of single files. It utilizes Bloom filters to detect similarity between a data stream and previous data streams and performs stateful routing, which solves the storage-node island problem. Moreover, it exploits the negligibly small amount of content shared among chunks from different file types to create groups of files and deduplicate each group in its own fingerprint-index space. It implements hierarchical deduplication to reduce the size of fingerprint indexes at the global level, where only files and large segments are deduplicated. Locality is created and exploited first through the large segments deduplicated at the global level and second by routing sets of consecutive files together to one storage node. Furthermore, using Bloom filters for similarity detection between streams has low communication and computation cost while achieving duplicate-elimination performance comparable to single-node deduplication. dCACH is evaluated using a prototype deployed on a server environment distributed over four separate machines. It is shown to be 10x faster than Extreme_Binn with minimal communication overhead, while its duplicate-elimination effectiveness is on a par with a single-node deduplication system.
Keywords: clustered deduplication; content-aware grouping; hierarchical deduplication; stateful routing; similarity; Bloom filters
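Bloom-filter-based similarity detection between streams can be sketched as comparing the overlap of set bits between per-node filters and routing the incoming file set to the closest node. The Python below is illustrative only; the filter width M, hash count K, and the similarity measure are assumptions rather than dCACH's actual design.

```python
import hashlib

M = 1 << 16   # Bloom filter width in bits
K = 4         # hash functions per fingerprint

def bloom_of(fingerprints):
    """Build a Bloom filter (a Python int bit vector) over chunk fingerprints."""
    bits = 0
    for fp in fingerprints:
        for i in range(K):
            h = hashlib.sha256(fp + bytes([i])).digest()
            bits |= 1 << (int.from_bytes(h[:8], "big") % M)
    return bits

def similarity(bloom_a, bloom_b):
    """Rough overlap estimate: shared set bits over the smaller filter's bits.
    Filters are cheap to exchange compared with full fingerprint lists."""
    shared = bin(bloom_a & bloom_b).count("1")
    smaller = min(bin(bloom_a).count("1"), bin(bloom_b).count("1"))
    return shared / smaller if smaller else 0.0

def route(stream_fps, node_blooms):
    """Stateful routing sketch: send the incoming file set to the storage node
    whose accumulated Bloom filter looks most similar."""
    incoming = bloom_of(stream_fps)
    best = max(node_blooms, key=lambda n: similarity(incoming, node_blooms[n]))
    node_blooms[best] |= incoming      # update that node's filter
    return best

if __name__ == "__main__":
    fps = lambda *names: [hashlib.sha256(n.encode()).digest() for n in names]
    nodes = {"node0": bloom_of(fps("a", "b", "c")), "node1": bloom_of(fps("x", "y"))}
    print(route(fps("a", "b", "d"), nodes))   # expected: node0
```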
10. Privacy-Enhanced Data Deduplication Computational Intelligence Technique for Secure Healthcare Applications
Authors: Jinsu Kim, Sungwook Ryu, Namje Park. Computers, Materials & Continua (SCIE, EI), 2022, Issue 2, pp. 4169-4184.
A significant number of cloud storage environments already implement deduplication technology. Due to the nature of the cloud environment, a storage server capable of accommodating large-capacity storage is required, and as storage capacity grows, additional storage solutions are needed. By leveraging deduplication, the cost problem can be fundamentally solved. However, deduplication poses privacy concerns due to its very structure. In this paper, we point out this privacy-infringement problem and propose a new deduplication technique to solve it. In the proposed technique, since the user's map structure and files are not stored on the server, the file-uploader list cannot be obtained by analyzing the server's meta-information, so user privacy is preserved. In addition, a personal identification number (PIN) can be used to solve the file-ownership problem, providing advantages such as safety against insider breaches and sniffing attacks. The proposed mechanism requires approximately 100 ms of additional time to add an IDRef that distinguishes user-file pairs during typical deduplication; for smaller files the time required for the additional operations is similar to the base operation time, but it becomes relatively smaller as file size grows.
Keywords: computational intelligence; cloud; multimedia; data deduplication
11. Differentially Authorized Deduplication System Based on Blockchain
Authors: ZHAO Tian, LI Hui, YANG Xin, WANG Han, ZENG Ming, GUO Haisheng, WANG Dezheng. ZTE Communications, 2021, Issue 2, pp. 67-76.
In cloud storage architectures, deduplication encrypted with a convergent key is one of the important data-compression technologies and effectively improves the utilization of space and bandwidth. To further refine the usage scenarios for various user permissions and enhance users' data security, we propose a blockchain-based differentially authorized deduplication system. The proposed system optimizes the traditional Proof of Vote (PoV) consensus algorithm and simplifies the existing differential-authorization process to realize credible management and dynamic updating of authority. Based on the decentralized property of blockchain, we overcome the centralized single-point-of-failure problem of traditional differentially authorized deduplication systems. Besides, the operations of legitimate users are recorded in blocks to ensure the traceability of behaviors.
Keywords: convergent key; deduplication; blockchain; differential authorization
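Convergent encryption, the technique the abstract refers to as deduplication "encrypted with the convergent key", derives the key from the plaintext itself so that identical files produce identical ciphertexts and remain deduplicable. A dependency-free Python sketch follows; the SHA-256 counter-mode keystream stands in for a real block cipher and is an assumption of this illustration.

```python
import hashlib

def _keystream(key: bytes, length: int) -> bytes:
    """Deterministic keystream derived from the key (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def convergent_encrypt(message: bytes):
    """Convergent encryption: the key is the hash of the message itself, so
    identical plaintexts always yield identical ciphertexts and can be
    deduplicated without the server learning the content."""
    key = hashlib.sha256(message).digest()
    ciphertext = bytes(m ^ k for m, k in zip(message, _keystream(key, len(message))))
    tag = hashlib.sha256(ciphertext).hexdigest()   # dedup fingerprint
    return key, ciphertext, tag

def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ciphertext, _keystream(key, len(ciphertext))))

if __name__ == "__main__":
    k1, c1, t1 = convergent_encrypt(b"patient record 42")
    k2, c2, t2 = convergent_encrypt(b"patient record 42")
    assert c1 == c2 and t1 == t2          # same plaintext -> same ciphertext
    assert convergent_decrypt(k1, c1) == b"patient record 42"
    print("dedup tag:", t1[:16], "...")
```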
12. Implementation and Validation of the Optimized Deduplication Strategy in Federated Cloud Environment
Authors: Nipun Chhabra, Manju Bala, Vrajesh Sharma. Computers, Materials & Continua (SCIE, EI), 2022, Issue 4, pp. 2019-2035.
Cloud computing technology is the culmination of advances in computer networks, hardware, and software capabilities that collectively gave rise to computing as a utility. It offers a plethora of utilities to clients worldwide in a very cost-effective way, enticing users and companies to migrate their infrastructure to cloud platforms. Swayed by its gigantic capacity and easy access, clients upload replicated data to the cloud, causing an unnecessary crunch of storage in data centres. Many data-compression techniques came to the rescue, but none could serve the purpose at the scale of a cloud, so research has turned to deduplicating data and reclaiming the existing storage capacity wasted by duplicates. To provide better cloud services through scalable provisioning of resources, interoperability has brought many Cloud Service Providers (CSPs) under one umbrella, termed a cloud federation. Many policies have been devised for private and public cloud deployment models for finding and eradicating replicated copies using hashing techniques, and the search for duplicate copies is not restricted to a single CSP but extends to the set of public or private CSPs contributing to the federation. It was found that even with advanced deduplication techniques for federated clouds, owing to the different nature of CSPs, a single file may be stored in both the private and the public group of the same cloud federation, which can be handled by an optimized deduplication strategy that addresses this issue. Therefore, this study aims to further optimize a deduplication strategy for the federated cloud environment and suggests a central management agent for the federation. Since no relevant existing work was found, this paper implements the concept of a federation agent and uses file-level deduplication to accomplish this approach.
Keywords: federation agent; deduplication in federated cloud; central management agent for cloud federation; interoperability in cloud computing; Bloom filters; cloud computing; cloud data storage
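File-level deduplication through a central federation agent can be reduced to a single fingerprint registry shared across member CSPs. The Python sketch below shows that minimal mechanism; the class name, CSP names, and object identifiers are hypothetical, and the paper's full agent protocol is not modelled.

```python
import hashlib

class FederationAgent:
    """Central registry sketch: the agent keeps one fingerprint -> location map
    for the whole federation, so a file already held by any member CSP is not
    stored again elsewhere (file-level deduplication)."""
    def __init__(self):
        self.registry = {}   # file fingerprint -> (csp name, object id)

    def store(self, csp: str, object_id: str, content: bytes) -> str:
        fp = hashlib.sha256(content).hexdigest()
        if fp in self.registry:
            return "duplicate at %s/%s" % self.registry[fp]
        self.registry[fp] = (csp, object_id)
        return "stored at %s/%s" % (csp, object_id)

if __name__ == "__main__":
    agent = FederationAgent()
    print(agent.store("private-csp-1", "obj-001", b"quarterly-report.pdf data"))
    print(agent.store("public-csp-2", "obj-777", b"quarterly-report.pdf data"))
    # The second call reports the existing copy instead of storing a new one.
```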
13. An Access Control Scheme for Encrypted Deduplication Cloud Storage with User-Defined Keys
Authors: 谷博伦, 徐子凯, 李卫海, 俞能海. 《网络与信息安全学报》, 2024, Issue 4, pp. 85-97.
With the rapid development and application of the Internet, traditional storage resources can hardly meet the growing demand for storing massive amounts of data, so more and more users upload their data to third-party cloud servers for unified storage. How to simultaneously achieve efficient encrypted deduplication and secure file sharing in the cloud has become an urgent problem. Moreover, users prefer to set their own passwords to encrypt and decrypt files and to share encrypted files only when needed. Accordingly, a deterministic stepwise encryption algorithm is designed: when the keys of the two encryption steps satisfy a certain relation, the two-step encryption is equivalent to a single encryption. On this basis, an encrypted deduplication scheme for cloud storage supporting dynamic access control is proposed, which encrypts files with the deterministic stepwise encryption algorithm and encrypts the file keys with a ciphertext-policy attribute-based encryption (CP-ABE) algorithm. This not only allows different users holding the same file to flexibly define their own encryption and decryption keys, but also ensures secure file sharing through a dynamic access-control mechanism. In addition, the access-control component is compatible with most existing CP-ABE schemes and even allows different CP-ABE schemes to be used in different attribute groups. The security analysis shows that the scheme achieves the highest security attainable under the current encrypted-deduplication paradigm. Experiments and analysis show that the scheme meets the practical needs of cloud service providers and users and offers good execution efficiency.
Keywords: encrypted deduplication; user-defined key; access control; stepwise elliptic curve encryption; deterministic modified optimal asymmetric encryption padding
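The key property of the deterministic stepwise encryption, that two encryption steps collapse into one when the step keys satisfy a relation, can be illustrated with elliptic-curve scalar multiplication (the keyword list mentions stepwise elliptic curve encryption; the paper's exact construction may differ, so this is only an assumed illustration).

```latex
% Illustration only: with elliptic-curve scalar multiplication over a group of
% order n, two encryption steps compose into one whenever the step keys
% multiply to the combined key modulo n.
\[
  C_1 = k_1 \cdot M, \qquad
  C_2 = k_2 \cdot C_1 = (k_1 k_2 \bmod n) \cdot M .
\]
% A user holding a self-chosen key k with k \equiv k_1 k_2 \pmod{n}
% can therefore decrypt in a single step:
\[
  M = (k^{-1} \bmod n) \cdot C_2 .
\]
```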
14. A Secure Deduplication Method for Heterogeneous Servers Based on Blockchain Smart Contracts
Authors: 江粼, 李嘉兴, 武继刚. 《郑州大学学报(工学版)》 (CAS, PKU Core), 2024, Issue 5, pp. 95-102, 142.
To resolve the conflict between improving reliability and applying deduplication when user data are stored on cloud servers in the big-data era, a secure deduplication method for heterogeneous servers based on blockchain smart contracts is proposed. It exploits the decentralization, tamper-resistance, and transparency of blockchain, together with the automatic execution capability of smart contracts, to achieve security, reliability, and privacy protection for data storage. Specifically, the method combines secret sharing with blockchain smart contracts to design a secure and efficient deduplication service for cloud storage. The blockchain replaces the role of a centralized third party, eliminating a potential security risk, and smart-contract scripts mitigate the heterogeneity between servers. Experimental results show that, for the same file size and different numbers of file blocks, the average computation overhead of the proposed method is 65.42%-115.77% lower than that of the compared methods, and the average storage overhead is reduced by 7.94%-19.50%. For different numbers of heterogeneous storage servers, the average computation and storage overheads are reduced by 67.27%-177.89% and 34.01%-72.89%, respectively. The proposed method outperforms two existing blockchain-based deduplication methods in terms of security, computation overhead, and storage overhead.
Keywords: blockchain; cloud storage; smart contract; secret sharing; data deduplication; security
15. Data Cleaning of Weaving Workshop Energy-Consumption Data Based on Cluster Analysis
Authors: 黄启航, 汝欣, 戴宁, 俞博, 陈炜, 徐郁山. 《软件工程》, 2024, Issue 7, pp. 22-27.
To address the low data quality and high data redundancy encountered during data collection in weaving workshops, a comprehensive data-cleaning method based on cluster analysis is proposed. First, a hierarchical analysis of workshop energy consumption in textile enterprises is performed, and an anomaly-detection method based on the bisecting K-means algorithm is proposed for abnormal data. Second, for missing data, diversified imputation methods are used to impute data with different characteristics; for high redundancy, the coefficient of determination is introduced to deduplicate the dataset and reduce its redundancy. Finally, simulation experiments on the operating data of a textile enterprise's workshop show that, after redundancy reduction, the data volume of the dataset decreases by 83%, while the fluctuation range of the mean absolute percentage error in prediction experiments on the dataset stays below 2%, indicating that the method reduces data redundancy while preserving prediction reliability.
Keywords: data cleaning; clustering; anomaly detection; deduplication
16. Research on Data Deduplication for the Electric Power Material Supply Chain Based on Feature Iteration (Cited by 1)
Authors: 王艳艳, 金义, 钱诚, 许晓艺. 《微型电脑应用》, 2024, Issue 4, pp. 144-148.
Existing deduplication methods for electric power material supply-chain data either deduplicate incompletely or delete normal data. To improve deduplication efficiency and performance, a feature-iteration-based deduplication method for electric power material supply-chain data is proposed. With the help of feature iteration, the method performs feature extraction and feature classification as preprocessing on the supply-chain data, simplifying the data volume in advance and reducing the difficulty and computational cost of deduplication. It then computes the similarity between the preprocessed data items and uses the Counting Bloom Filter algorithm to identify the similar data eligible for deletion and remove them, thereby achieving deduplication of electric power material supply-chain data. Experimental results show that the proposed method uses little storage space, has good deduplication capability, and requires little time for deduplication.
Keywords: feature iteration; preprocessing; data deduplication; similarity computation; feature extraction
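A Counting Bloom Filter, named explicitly in this abstract, replaces each bit with a small counter so that entries can be removed as well as queried. A generic Python sketch follows; the sizes and hash count are arbitrary choices, and the paper's feature-iteration pipeline around it is not modelled.

```python
import hashlib

class CountingBloomFilter:
    """Counting Bloom filter: each position holds a counter instead of a bit,
    so items can also be removed, which matters when deduplicated records may
    later be deleted."""
    def __init__(self, size: int = 4096, hashes: int = 4):
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _positions(self, item: bytes):
        for i in range(self.hashes):
            h = hashlib.sha256(item + bytes([i])).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: bytes):
        for p in self._positions(item):
            self.counters[p] += 1

    def remove(self, item: bytes):
        if self.might_contain(item):
            for p in self._positions(item):
                self.counters[p] -= 1

    def might_contain(self, item: bytes) -> bool:
        return all(self.counters[p] > 0 for p in self._positions(item))

if __name__ == "__main__":
    cbf = CountingBloomFilter()
    cbf.add(b"record-1001")
    print(cbf.might_contain(b"record-1001"))   # True -> candidate duplicate
    cbf.remove(b"record-1001")
    print(cbf.might_contain(b"record-1001"))   # False after removal
```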
17. A Lightweight Secure Cloud Storage Method for Images Based on Fuzzy Generalized Deduplication
Authors: 陈海欣, 唐鑫, 金路超, 付耀文, 周艺腾. 《应用科学学报》 (CAS, CSCD, PKU Core), 2024, Issue 5, pp. 769-781.
Generalized deduplication is an important means of achieving secure deduplication of cloud data. Existing generalized deduplication methods support only exact deduplication and cannot be organically combined with image encryption, while image encryption itself imposes a heavy computational burden on users. To address these challenges, this paper proposes a lightweight secure cloud storage method for images based on fuzzy generalized deduplication. First, an integer wavelet transform is applied to the image data; the low-frequency components are extracted as the base and the high-frequency components as the deviation, and a lightweight XOR-based encryption algorithm is proposed to combine image confidentiality protection with generalized deduplication. In addition, fuzzy deduplication is performed on the deviations in the cloud, so that the cloud keeps only a single copy of highly similar deviation data, achieving fuzzy generalized deduplication of image data in the cloud. Experiments on relevant image datasets show that, while achieving security, the proposed method significantly improves communication and storage efficiency.
Keywords: image deduplication; side-channel attack; cloud storage; fuzzy deduplication; image encryption
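The base/deviation split plus XOR encryption can be illustrated with an integer Haar-style (S-transform) decomposition, which is integer-valued and exactly invertible. The Python sketch below is a stand-in for the paper's integer wavelet transform and lightweight cipher, not a reproduction of them; key derivation and pixel ranges are assumptions.

```python
import hashlib

def haar_int_split(pixels):
    """Integer Haar-style (S-transform) split of a 1-D pixel sequence into a
    low-frequency 'base' and a high-frequency 'deviation'."""
    base, deviation = [], []
    for i in range(0, len(pixels) - 1, 2):
        a, b = pixels[i], pixels[i + 1]
        base.append((a + b) >> 1)   # low-frequency average
        deviation.append(a - b)     # high-frequency difference
    return base, deviation

def haar_int_merge(base, deviation):
    """Exact inverse of haar_int_split."""
    pixels = []
    for l, h in zip(base, deviation):
        a = l + ((h + 1) >> 1)
        pixels.extend([a, a - h])
    return pixels

def xor_encrypt(values, key: bytes):
    """Lightweight XOR encryption with a hash-derived keystream
    (byte-range values assumed)."""
    stream = b""
    counter = 0
    while len(stream) < len(values):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return [v ^ stream[i] for i, v in enumerate(values)]

if __name__ == "__main__":
    row = [52, 55, 61, 66, 70, 61, 64, 73]       # one image row
    base, dev = haar_int_split(row)
    enc_base = xor_encrypt(base, b"user-key")    # base kept confidential
    # dev would be sent for fuzzy dedup; reconstruction check:
    assert haar_int_merge(base, dev) == row
    print(base, dev, enc_base)
```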
18. A Data-Popularity-Based Deduplication Model for Decentralized Storage
Authors: 汪彩梅, 闻琪略, 周子健, 卢建豪, 张琛, 吴志泽. 《计算机应用研究》 (CSCD, PKU Core), 2024, Issue 5, pp. 1544-1553.
Data-popularity-based deduplication schemes suffer from dishonest detection authorities and unreliable data storage. To address these problems, a data-popularity deduplication model for decentralized storage is proposed. Against dishonest detection authorities, the model combines the tamper-resistance of blockchain with the non-repudiation of smart contracts, using a smart contract as the detection authority to perform duplicate detection and popularity detection, which guarantees the authenticity of detection results. Against unreliable data storage, a file-chain storage structure is proposed. This structure satisfies the requirements of popularity-based deduplication and, by adding auxiliary information, establishes logical relationships among shards that are physically or logically uploaded and distributed across different storage nodes, providing a basis for decentralized network storage of popularity data. Meanwhile, a backup flag is added to the block information; with this flag, the storage network is divided into two virtual storage spaces that separately handle the detection and storage of data and of backup data, satisfying users' backup needs. Security and performance analyses show that the scheme is feasible, guarantees the authenticity of detection results, and improves the reliability of data storage.
Keywords: data deduplication; data popularity; decentralization; blockchain; storage reliability
19. A Data Deduplication Strategy Based on Cluster Analysis and Cache Optimization
Authors: 裴世豪, 刘颖, 李佳阳, 郝欣哲. 《计算机应用文摘》, 2024, Issue 18, pp. 114-118.
To address the problem that traditional deduplication algorithms cannot balance deduplication ratio and throughput, a similarity-clustering deduplication algorithm is designed. Based on data-similarity theory, the algorithm defines the similarity between data items, classifies and labels similar items, and keeps part of the characteristic data of each cluster in the cache. When new data arrive, the algorithm selects a suitable cluster for deduplication according to the data's features. In addition, to use the limited cache efficiently, a cache-optimization method based on the random forest algorithm is proposed to optimize the fingerprint cache used during deduplication and improve the hit rate of cached fingerprints. The cache model is based on a traditional random forest classifier whose hyperparameters are tuned by an improved fireworks algorithm (optimized with the ELU function), which effectively copes with excessive data volumes, too many fingerprints, and limited cache capacity. Experiments show that, compared with the similarity-based RMD and Shingle methods, the proposed algorithm improves both deduplication ratio and throughput by 10%-15%.
Keywords: data deduplication; disaster backup; data similarity; fireworks algorithm; random forest
20. A Data-Popularity-Based Deduplication Scheme for Cloud Storage
Authors: 何欣枫, 杨琴琴. 《西安电子科技大学学报》 (EI, CAS, CSCD, PKU Core), 2024, Issue 1, pp. 187-200.
With the development of cloud computing, enterprises and individuals tend to outsource their data to cloud storage servers to relieve local storage pressure, so storage pressure on the cloud side has become an increasingly prominent problem. To improve cloud storage efficiency and reduce communication cost, data deduplication has been widely applied. Existing deduplication techniques mainly include hash-table-based deduplication of identical data and Bloom-filter-based deduplication of similar data, but they rarely consider the effect of data popularity. In practice, the data that users outsource to cloud servers are unevenly distributed and can be divided into popular and unpopular data according to access frequency. Popular data are accessed frequently, and the cloud server holds many replicas and similar copies of them, so high-precision deduplication is needed; unpopular data are accessed infrequently, with few replicas and similar copies on the server, so low-precision deduplication suffices. To address this, data popularity is combined with Bloom filters to propose a popularity-based dynamic Bloom filter and, on top of it, a data deduplication scheme that dynamically adjusts deduplication precision according to data popularity. Simulation results show that the scheme strikes a good balance among time consumption, space consumption, and false-positive rate.
Keywords: cloud computing; cloud storage; data deduplication; data popularity; Bloom filter
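The idea of popularity-dependent deduplication precision can be sketched by sizing a Bloom filter from a target false-positive rate: popular data get a stricter target (larger filter, more hash functions), unpopular data a looser one. The Python below uses the standard Bloom-filter sizing formulas; the target rates and everything around the filter are assumptions, not the paper's design.

```python
import hashlib
import math

class PopularityBloomFilter:
    """Bloom filter sized from a target false-positive rate: popular data get a
    stricter (lower) target rate and therefore a larger bit array and more hash
    functions, while unpopular data get a cheaper, looser filter."""
    def __init__(self, expected_items: int, target_fp_rate: float):
        m = -expected_items * math.log(target_fp_rate) / (math.log(2) ** 2)
        self.size = max(8, int(m))                       # bits
        self.hashes = max(1, round(self.size / expected_items * math.log(2)))
        self.bits = 0

    def _positions(self, item: bytes):
        for i in range(self.hashes):
            h = hashlib.sha256(item + bytes([i])).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits >> p & 1 for p in self._positions(item))

def make_filter(expected_items: int, popular: bool) -> PopularityBloomFilter:
    # High-precision filter for popular data, cheaper filter for unpopular data.
    return PopularityBloomFilter(expected_items, 0.001 if popular else 0.05)

if __name__ == "__main__":
    hot = make_filter(10_000, popular=True)
    cold = make_filter(10_000, popular=False)
    print("popular filter bits:", hot.size, "hashes:", hot.hashes)
    print("unpopular filter bits:", cold.size, "hashes:", cold.hashes)
```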