Research and Design of Data De-duplication Architecture Based on Cloud Storage

Cited by: 6
Abstract: With the development of cloud computing, cloud storage uses cluster applications, virtualization, and distributed file systems to bring large numbers of heterogeneous networked storage devices together to work collaboratively, easing the storage pressure on traditional data centers. Data de-duplication, a technique that reduces both storage space and network traffic, is likely to be adopted in cloud storage as the cloud becomes more widely used, and combining the two technologies should bring practical benefits to the IT storage industry. Building on a study of both technologies, this paper designs a de-duplication architecture based on cloud storage and proposes a scheme in which the client performs in-line de-duplication, combining chunk-level and byte-level detection, before the data are stored in the cloud. In this architecture, the bulk data are stored in HDFS, while the hash values of the file data chunks are stored in HBase.
Source: Computer Systems & Applications (《计算机系统应用》), 2013, No. 1, pp. 208-211 (4 pages)
Keywords: data de-duplication technology; cloud storage; hash value; HDFS; HBase
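The scheme summarized in the abstract (client-side, in-line, chunk-level de-duplication with a hash index kept separately from the bulk data) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the paper's implementation: the class name `DedupClient`, the fixed 4096-byte chunk size, and the use of SHA-1 are all hypothetical choices, two in-memory dictionaries stand in for HBase (hash index) and HDFS (chunk storage), and the byte-level step mentioned in the abstract is omitted.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; the abstract does not specify one


class DedupClient:
    """Hypothetical client that de-duplicates chunks before they reach the cloud."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.hash_index = {}     # stand-in for HBase: chunk hash -> reference count
        self.chunk_store = {}    # stand-in for HDFS: chunk hash -> chunk bytes
        self.bytes_uploaded = 0  # bytes actually sent to the chunk store

    def put_file(self, data: bytes) -> list:
        """Split data into fixed-size chunks, upload only unseen ones,
        and return the ordered list of chunk hashes (the file recipe)."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha1(chunk).hexdigest()
            if digest not in self.hash_index:
                # New chunk: store it and register its hash in the index.
                self.chunk_store[digest] = chunk
                self.hash_index[digest] = 0
                self.bytes_uploaded += len(chunk)
            # Duplicate or not, the file now references this chunk.
            self.hash_index[digest] += 1
            recipe.append(digest)
        return recipe

    def get_file(self, recipe: list) -> bytes:
        """Rebuild a file from its recipe of chunk hashes."""
        return b"".join(self.chunk_store[digest] for digest in recipe)


if __name__ == "__main__":
    client = DedupClient(chunk_size=4)
    recipe = client.put_file(b"aaaabbbbaaaa")
    print(len(recipe), "chunks referenced,", client.bytes_uploaded, "bytes uploaded")
```

Because the hash lookup happens before any chunk is sent, a duplicate chunk costs only an index lookup rather than a transfer, which is the storage and network saving the abstract attributes to in-line de-duplication at the client.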