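Building on an analysis of HDFS, this work applies the File System API to store and access files in the HDFS distributed file system, and optimizes replica selection with an improved ant colony algorithm. The HDFS API handles the storage and management of massive data effectively and improves the efficiency of massive data storage. The improved ant colony algorithm raises the efficiency of replica selection when files are read, which further improves system efficiency and balances the load.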
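The abstract does not describe how the ant colony algorithm was improved, so the following is only a minimal sketch of the classic ant colony selection rule applied to replica choice, not the authors' method. The replica names (`dn1`, `dn2`, `dn3`), the heuristic values (assumed here to reflect something like inverse node load), and the parameters `alpha`, `beta`, `rho`, and `q` are all hypothetical.

```python
import random


def selection_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Classic ACO rule: p_i is proportional to tau_i**alpha * eta_i**beta."""
    weights = {
        replica: (pheromone[replica] ** alpha) * (heuristic[replica] ** beta)
        for replica in pheromone
    }
    total = sum(weights.values())
    return {replica: weight / total for replica, weight in weights.items()}


def choose_replica(pheromone, heuristic, rng, alpha=1.0, beta=2.0):
    """Pick one replica at random, weighted by the selection probabilities."""
    probs = selection_probabilities(pheromone, heuristic, alpha, beta)
    replicas = list(probs)
    return rng.choices(replicas, weights=[probs[r] for r in replicas], k=1)[0]


def update_pheromone(pheromone, chosen, read_time, rho=0.1, q=1.0):
    """Evaporate every trail by rho, then reinforce the chosen replica.

    A faster read (smaller read_time) deposits more pheromone.
    """
    for replica in pheromone:
        pheromone[replica] *= 1.0 - rho
    pheromone[chosen] += q / read_time


# Hypothetical state: equal pheromone, dn2 assumed to be the least loaded node.
pheromone = {"dn1": 1.0, "dn2": 1.0, "dn3": 1.0}
heuristic = {"dn1": 1.0, "dn2": 2.0, "dn3": 1.0}
picked = choose_replica(pheromone, heuristic, random.Random(0))
update_pheromone(pheromone, picked, read_time=0.5)
```

Evaporation keeps a replica that was fast in the past from dominating indefinitely, which is one way such a scheme can spread reads across nodes.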
The Scalable I/O (SIO) Initiative's Low-Level Application Programming Interface (SIO LLAPI) provides file system implementers with a simple low-level interface to support high-level parallel I/O interfaces efficiently and effectively. This paper describes a reference implementation and an evaluation of the SIO LLAPI on the Intel Paragon multicomputer. The implementation provides a file system structure and striping algorithm compatible with the Parallel File System (PFS) of the Intel Paragon, and runs either inside the kernel or as a user-level library. Scatter-gather read/write addressing, asynchronous I/O, client caching and prefetching, a file access hint mechanism, collective I/O, and highly efficient file copy have been implemented. Preliminary experience shows that the SIO LLAPI offers opportunities for significant performance improvement and is easy to implement. Several high-level file system interfaces and applications, such as PFS, ADIO, and a Hartree-Fock application, have also been implemented on top of SIO. The performance of PFS on SIO is at least equal to that of Intel's native PFS, and in many cases, such as small sequential file access, huge I/O requests, and collective I/O, it is stable and much better. The SIO features make it easier and quicker to support high-level interfaces efficiently, and caching, prefetching, and hints help achieve better performance under different access models. The scalability and performance of SIO are limited by network latency, network scalable bandwidth, memory copy bandwidth, memory size, and the pattern of I/O requests. The tradeoff between generality and efficiency should be considered in an implementation.
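The abstract names a PFS-compatible striping algorithm without giving its details. As an illustration only, the sketch below shows generic round-robin striping, which maps a logical file offset to an I/O node and an offset within that node's stripe file, and splits one request into per-node pieces. The function names and the `stripe_unit` and `num_nodes` parameters are hypothetical and are not taken from the SIO LLAPI or from Paragon PFS.

```python
def locate(offset, stripe_unit, num_nodes):
    """Map a logical file offset to (node index, offset on that node)."""
    unit_index = offset // stripe_unit          # which stripe unit holds the byte
    node = unit_index % num_nodes               # units are dealt out round-robin
    local_offset = (unit_index // num_nodes) * stripe_unit + offset % stripe_unit
    return node, local_offset


def split_request(offset, length, stripe_unit, num_nodes):
    """Break one contiguous request into (node, local offset, length) pieces.

    Each piece stays within a single stripe unit, so it touches one node.
    """
    pieces = []
    end = offset + length
    while offset < end:
        node, local_offset = locate(offset, stripe_unit, num_nodes)
        room_in_unit = stripe_unit - offset % stripe_unit
        chunk = min(room_in_unit, end - offset)
        pieces.append((node, local_offset, chunk))
        offset += chunk
    return pieces


# Hypothetical layout: 64-byte stripe units across 4 I/O nodes.
pieces = split_request(offset=48, length=100, stripe_unit=64, num_nodes=4)
```

A request that spans several stripe units fans out to several nodes, which is why large and collective requests can benefit from striping while the pattern of I/O requests still bounds the achievable parallelism.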