Journal Articles (5 found)
1. A GPU-Accelerated In-Memory Metadata Management Scheme for Large-Scale Parallel File Systems (Cited by: 1)
Authors: Zhi-Guang Chen, Yu-Bo Liu, Yong-Feng Wang, Yu-Tong Lu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, No. 1, pp. 44-55 (12 pages)
Driven by the increasing requirements of high-performance computing applications, supercomputers tend to contain more and more computing nodes. Applications running on such a large-scale computing system are likely to spawn millions of parallel processes, which usually generate bursts of I/O requests, posing a great challenge to the metadata management of the underlying parallel file systems. The traditional way to overcome this challenge is to adopt multiple metadata servers in a scale-out manner, which inevitably confronts serious network and consistency problems. This work instead pursues enhancing metadata performance in a scale-up manner. Specifically, we propose to improve the performance of each individual metadata server by employing a GPU to handle metadata requests in parallel. Our proposal designs a novel metadata server architecture that employs the CPU to interact with file system clients while offloading the computing tasks for metadata onto the GPU. To take full advantage of the parallelism available in the GPU, we redesign the in-memory data structure for the file system namespace. The new data structure fits the memory architecture of the GPU and thus helps exploit the large number of parallel GPU threads to serve bursty metadata requests concurrently. We implement a prototype based on BeeGFS and conduct extensive experiments to evaluate our proposal. The experimental results demonstrate that our GPU-based solution outperforms the CPU-based scheme by more than 50% under typical metadata operations. The superiority is strengthened further in highly concurrent scenarios, e.g., high-performance computing systems supporting millions of parallel threads.
Keywords: GPU-accelerated; in-memory metadata management; parallel file system
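The abstract's key idea, a namespace data structure laid out so that many independent threads can each serve one metadata request, can be illustrated with a minimal CPU-side sketch. This is not the paper's implementation: the flat open-addressed table, its size, and all names (`Namespace`, `batch_lookup`, ...) are illustrative assumptions; on a real GPU the per-request loop body would run as one thread over the same contiguous arrays.

```python
# Sketch (assumption, not the paper's code): a flat, structure-of-arrays
# hash table over full path names, so each lookup touches contiguous
# arrays -- the layout property that suits one-GPU-thread-per-request.

TABLE_SIZE = 1024  # power of two so we can mask instead of mod

class Namespace:
    def __init__(self):
        # Parallel arrays: a GPU-friendly structure-of-arrays layout.
        self.keys = [None] * TABLE_SIZE    # full path, e.g. "/a/b/c"
        self.inodes = [None] * TABLE_SIZE  # metadata payload per path

    def _slot(self, path):
        return hash(path) & (TABLE_SIZE - 1)

    def insert(self, path, inode):
        i = self._slot(path)
        while self.keys[i] is not None and self.keys[i] != path:
            i = (i + 1) & (TABLE_SIZE - 1)  # linear probing
        self.keys[i], self.inodes[i] = path, inode

    def lookup(self, path):
        i = self._slot(path)
        while self.keys[i] is not None:
            if self.keys[i] == path:
                return self.inodes[i]
            i = (i + 1) & (TABLE_SIZE - 1)
        return None

def batch_lookup(ns, paths):
    # On a GPU each iteration would be an independent thread serving
    # one request of the burst; here we simply map over the batch.
    return [ns.lookup(p) for p in paths]
```

The point of the flat layout is that lookups need no pointer chasing through a directory tree, which is what makes the bursty, data-parallel service model possible.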
2. Adaptive and scalable load balancing for metadata server cluster in cloud-scale file systems
Authors: Quanqing Xu, Rajesh Vellore Arumugam, Khai Leong Yong, Yonggang Wen, Yew-Soon Ong, Weiya Xi. Frontiers of Computer Science (SCIE, EI, CSCD), 2015, No. 6, pp. 904-918 (15 pages)
Big data is an emerging term in the storage industry, referring to data analytics on big storage, i.e., cloud-scale storage. In cloud-scale (or EB-scale) file systems, balancing the request workload across a metadata server cluster is critical for avoiding performance bottlenecks and improving quality of service. Many good approaches have been proposed for load balancing in distributed file systems. Some of them focus on global namespace balancing, making the metadata distribution across metadata servers as uniform as possible. However, they do not work well under skewed request distributions, which impair load balancing but simultaneously increase the effectiveness of caching and replication. In this paper, we propose Cloud Cache (C2), an adaptive and scalable load balancing scheme for metadata server clusters in EB-scale file systems. It combines an adaptive cache diffusion scheme with an adaptive replication scheme to address the request load balancing problem, and it can be integrated into existing distributed metadata management approaches to efficiently improve their load balancing performance. C2 runs as follows: 1) adaptive cache diffusion runs first: if a node is overloaded, load-shedding is used; otherwise, load-stealing is used; and 2) the adaptive replication scheme runs second: if a very popular metadata item (or at least two such items) causes a node to be overloaded, the adaptive replication scheme is used, in which the very popular item is not split across several nodes by adaptive cache diffusion because of its knapsack property. Experimental results from trace-driven simulations demonstrate the efficiency and scalability of C2.
Keywords: metadata management; load balancing; adaptive cache diffusion; adaptive replication; cloud-scale file systems
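The two-step decision flow the C2 abstract describes can be sketched as a small dispatch function. This is a hedged reading of the abstract only: the threshold constants (`OVERLOAD`, `HOT_SHARE`) and the function shape are assumptions, not the paper's actual parameters or code.

```python
# Sketch of the C2 decision flow from the abstract (assumed parameters).

OVERLOAD = 100   # requests/s beyond which a metadata server is "overloaded"
HOT_SHARE = 0.5  # an item carrying >= this share of an overloaded node's
                 # load is treated as "very popular" (the knapsack case)

def balance_action(node_load, item_loads):
    """Return which C2 mechanism applies to one metadata server.

    node_load:  total request load on this node
    item_loads: dict of per-metadata-item request loads on this node
    """
    if node_load <= OVERLOAD:
        # Underloaded nodes pull work from overloaded neighbours.
        return "load-stealing"
    hot = [i for i, l in item_loads.items() if l >= HOT_SHARE * node_load]
    if hot:
        # A very popular item's load is indivisible (knapsack property),
        # so cache diffusion cannot split it; replicate it instead.
        return ("replicate", hot)
    # Otherwise shed a slice of ordinary items via cache diffusion.
    return "load-shedding"
```

The split mirrors the abstract's ordering: cache diffusion handles ordinary skew, and replication is reserved for single items too hot to move as a unit.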
3. ONFS: a hierarchical hybrid file system based on memory, SSD, and HDD for high performance computers (Cited by: 1)
Authors: Xin Liu, Yu-Tong Lu, Jie Yu, Peng-Fei Wang, Jie-Ting Wu, Ying Lu. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2017, No. 12, pp. 1940-1971 (32 pages)
With supercomputers developing towards exascale, the number of compute cores increases dramatically, making more complex and larger-scale applications possible. The input/output (I/O) requirements of large-scale applications, workflow applications, and their checkpointing include substantial bandwidth and extremely low latency, posing a serious challenge to high performance computing (HPC) storage systems. Current hard disk drive (HDD) based underlying storage systems are becoming increasingly unable to meet the requirements of next-generation exascale supercomputers. To rise to the challenge, we propose a hierarchical hybrid storage system, the on-line and near-line file system (ONFS). It leverages dynamic random access memory (DRAM) and solid state drives (SSDs) in compute nodes, and HDDs in storage servers, to build a three-level storage system in a unified namespace. It supports portable operating system interface (POSIX) semantics and provides high bandwidth, low latency, and huge storage capacity. In this paper, we present the technical details of distributed metadata management, the memory borrow-and-return strategy, data consistency, parallel access control, and the mechanisms guiding downward and upward migration in ONFS. We implement an ONFS prototype on the TH-1A supercomputer and conduct experiments to test its I/O performance and scalability. The results show that single-thread and multi-thread 'read'/'write' bandwidths are 6-fold and 5-fold better than those of HDD-based Lustre, respectively. The I/O bandwidth of data-intensive applications in ONFS can be 6.35 times that in Lustre.
Keywords: high performance computing; hierarchical hybrid storage system; distributed metadata management; data migration
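The downward-migration mechanism the ONFS abstract mentions (DRAM to SSD to HDD) can be illustrated with a toy tiered store. Everything here is an assumption for illustration: the per-tier capacities, the coldest-first eviction order, and all class and method names; the actual ONFS migration policy is described in the paper, not here.

```python
# Toy sketch of downward migration across DRAM -> SSD -> HDD tiers,
# in the spirit of ONFS; capacities and policy are illustrative only.

from collections import OrderedDict

TIERS = ["DRAM", "SSD", "HDD"]
CAPACITY = {"DRAM": 2, "SSD": 4, "HDD": 10**9}  # files per tier (toy numbers)

class TieredStore:
    def __init__(self):
        # One OrderedDict per tier: insertion order approximates warm -> cold.
        self.tiers = {t: OrderedDict() for t in TIERS}

    def write(self, name, data):
        # New data always lands in the fastest tier.
        self._place(0, name, data)

    def _place(self, level, name, data):
        tier = TIERS[level]
        store = self.tiers[tier]
        if len(store) >= CAPACITY[tier]:
            # Downward migration: push the coldest file to the next tier.
            cold_name, cold_data = store.popitem(last=False)
            self._place(level + 1, cold_name, cold_data)
        store[name] = data

    def locate(self, name):
        # The unified namespace: one lookup spans all three tiers.
        for t in TIERS:
            if name in self.tiers[t]:
                return t
        return None
```

With DRAM capacity 2, writing a third file pushes the coldest resident down to the SSD tier while the namespace view stays unified.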
4. Construction and application of LHAASO data processing platform (Cited by: 1)
Authors: Yaodong Cheng, Haibo Li, Yujiang Bi, Jingyan Shi, Shan Zeng, Hongmei Zhang, Ge Ou, Mengyao Qi, Qiuling Yao, Yaosong Cheng. Radiation Detection Technology and Methods (CSCD), 2022, No. 3, pp. 418-426 (9 pages)
Purpose: The LHAASO project collects trillions of cosmic ray events every year, generating about 10 PB of raw data annually, which poses big challenges for the data processing platform. Method: The LHAASO data processing platform is built to handle such a large amount of data; it is composed of subsystems for data transfer, data storage, high throughput computing, and metadata management. Results and conclusions: The platform has been under construction since 2018 and has been working well since 2021. In this paper, the details of the design, implementation, and performance of the data processing platform are presented.
Keywords: LHAASO; data processing platform; data storage and management; high-performance computing; metadata management
5. A Ceph-based storage strategy for big gridded remote sensing data
Authors: Xinyu Tang, Xiaochuang Yao, Diyou Liu, Long Zhao, Li Li, Dehai Zhu, Guoqing Li. Big Earth Data (EI), 2022, No. 3, pp. 323-339 (17 pages)
When using distributed storage systems to store gridded remote sensing data in large, distributed clusters, most solutions utilize big-table index storage strategies. However, in practice, the performance of big-table index storage strategies degrades as scenarios become more complex, and the reasons for this phenomenon are analyzed in this paper. To improve the read and write performance of distributed gridded data storage, this paper proposes a storage strategy based on the Ceph software. The strategy encapsulates remote sensing images as objects through a metadata management strategy to achieve spatiotemporal retrieval of gridded data, finding the cluster location of gridded data through hash-like calculations. The method effectively supports spatial operations in the clustered database while enabling fast random reads and writes of the gridded data. Random write and spatial query experiments proved the feasibility, effectiveness, and stability of this strategy: the method is more stable than the big-table index storage strategy, its average query time is 38% lower, and it greatly improves the storage and query efficiency of gridded images.
Keywords: gridded images; distributed storage system; metadata management; Ceph; remote sensing
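The "hash-like calculation" this abstract contrasts with big-table indexing can be sketched in a few lines: a spatiotemporal grid cell is encoded into a deterministic object name, and a hash of that name yields the cluster location directly, so no central index lookup is needed. The key format, the node count, and the use of MD5 modulo node count are all illustrative assumptions; Ceph's actual placement uses the considerably more elaborate CRUSH algorithm.

```python
# Sketch (assumptions throughout): computed, index-free placement of
# gridded remote-sensing objects, a greatly simplified CRUSH-like idea.

import hashlib

NUM_NODES = 8  # cluster size (illustrative)

def object_name(product, date, tile_x, tile_y):
    # Encode the spatiotemporal grid cell into a deterministic object name.
    return f"{product}/{date}/x{tile_x:04d}_y{tile_y:04d}"

def placement(name, num_nodes=NUM_NODES):
    # Stable hash -> node id; every client computes the same location
    # without consulting any big-table index. (MD5 is used only as a
    # stable hash here, not for security.)
    digest = hashlib.md5(name.encode()).hexdigest()
    return int(digest, 16) % num_nodes

name = object_name("landsat8", "2021-06-01", 12, 34)
node = placement(name)
```

Because the name encodes product, date, and tile coordinates, a spatiotemporal query can enumerate the grid cells it covers, regenerate their names, and read each object from its computed location.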