Abstract
With the rapid growth of Internet big data, every industry is conducting research around it. At the same time, the explosive increase in data volume raises the problem of how to reclaim garbage data. This paper designs and implements a garbage data reclamation subsystem for HLFS (Hadoop distributed file system based Log-structured File System), a log-structured file system for cloud storage. Adding the subsystem to HLFS both improves the utilisation of data space and effectively prevents data space from running out. To analyse the performance of the subsystem, the paper finally compares its advantages and disadvantages with those of the garbage data reclamation mechanisms in other systems, helping users choose a suitable mechanism to improve disk utilisation and system performance.
Authors
Jia Weiwei; Lin Yi; Zhang Yanyuan (School of Computing Science, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China)
Source
Computer Applications and Software
CSCD
2016, Issue 8, pp. 53-56, 178 (5 pages)
Funding
National Natural Science Foundation of China (61272123)
Keywords
Cloud storage
Log-structured file system
Junk data reclamation