期刊文献+
共找到3篇文章
< 1 >
每页显示 20 50 100
Enhanced Best Fit Algorithm for Merging Small Files
1
作者 Adnan Ali Nada Masood Mirza Mohamad Khairi Ishak 《Computer Systems Science & Engineering》 SCIE EI 2023年第7期913-928,共16页
In the Big Data era,numerous sources and environments generate massive amounts of data.This enormous amount of data necessitates specialized advanced tools and procedures that effectively evaluate the information and ... In the Big Data era,numerous sources and environments generate massive amounts of data.This enormous amount of data necessitates specialized advanced tools and procedures that effectively evaluate the information and anticipate decisions for future changes.Hadoop is used to process this kind of data.It is known to handle vast volumes of data more efficiently than tiny amounts,which results in inefficiency in the framework.This study proposes a novel solution to the problem by applying the Enhanced Best Fit Merging algorithm(EBFM)that merges files depending on predefined parameters(type and size).Implementing this algorithm will ensure that the maximum amount of the block size and the generated file size will be in the same range.Its primary goal is to dynamically merge files with the stated criteria based on the file type to guarantee the efficacy and efficiency of the established system.This procedure takes place before the files are available for the Hadoop framework.Additionally,the files generated by the system are named with specific keywords to ensure there is no data loss(file overwrite).The proposed approach guarantees the generation of the fewest possible large files,which reduces the input/output memory burden and corresponds to the Hadoop framework’s effectiveness.The findings show that the proposed technique enhances the framework’s performance by approximately 64%while comparing all other potential performance-impairing variables.The proposed approach is implementable in any environment that uses the Hadoop framework,not limited to smart cities,real-time data analysis,etc. 展开更多
关键词 Big data Hadoop MapReduce small file HDFS
下载PDF
Multi-Level Cache System of Small Spatio-Temporal Data Files Based on Cloud Storage in Smart City
2
作者 XU Xiaolin HU Zhihua LIU Xiaojun 《Wuhan University Journal of Natural Sciences》 CAS CSCD 2017年第5期387-394,共8页
In this paper, we present a distributed multi-level cache system based on cloud storage, which is aimed at the low access efficiency of small spatio-temporal data files in information service system of Smart City. Tak... In this paper, we present a distributed multi-level cache system based on cloud storage, which is aimed at the low access efficiency of small spatio-temporal data files in information service system of Smart City. Taking classification attribute of small spatio-temporal data files in Smart City as the basis of cache content selection, the cache system adopts different cache pool management strategies in different levels of cache. The results of experiment in prototype system indicate that multi-level cache in this paper effectively increases the access bandwidth of small spatio-temporal files in Smart City and greatly improves service quality of multiple concurrent access in system. 展开更多
关键词 Smart City spatio-temporal data multi-level cache small file
原文传递
Fine-grained management of I/O optimizations based on workload characteristics
3
作者 Bing WEI Limin XIAO +3 位作者 Bingyu ZHOU Guangjun QIN Baicheng YAN Zhisheng HUO 《Frontiers of Computer Science》 SCIE EI CSCD 2021年第3期33-46,共14页
With the advent of new computing paradigms,parallel file systems serve not only traditional scientific computing applications but also non-scientific computing applications,such as financial computing,business,and pub... With the advent of new computing paradigms,parallel file systems serve not only traditional scientific computing applications but also non-scientific computing applications,such as financial computing,business,and public administration.Parallel file systems provide storage services for multiple applications.As a result,various requirements need to be met.However,parallel file systems usually provide a unified storage solution,which cannot meet specific application needs.In this paper,an extended tile handle scheme is proposed to deal with this problem.The original file handle is extended to record I/O optimization information,which allows file systems to specify optimizations for a file or directory based on workload characteristics.Therefore,fine-grained management of I/O optimizations can be achieved.On the basis of the extended file handle scheme,data prefetching and small file optimization mechanisms are proposed for parallel file systems.The experimental results show that the proposed approach improves the aggregate throughput of the overall system by up to 189.75%. 展开更多
关键词 parallel file systems workload characteristics extended file handle data prefetching small files
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部