
An Approach for Improving the Storage Performance of Massive Electronic Health Records
Cited by: 2
Abstract: Medical tourism is an emerging industry. As its data volume grows, storing valid data efficiently and giving users fast access have become pressing problems. Hadoop addresses much of this need, but it is poorly suited to handling large numbers of small files. Its HDFS (Hadoop Distributed File System) uses a master-slave architecture: when many small files are stored, metadata grows rapidly and consumes a large share of NameNode memory, and read performance also suffers, which directly reduces the scalability and efficiency of the whole system. Combining the strengths of an RDBMS and Hadoop, this paper proposes an improved optimization scheme for small-file storage. Based on the characteristics of electronic health record data, it also proposes transmitting and storing data by replica group, and it adopts a data prefetching mechanism to improve access efficiency. Experiments show that the method effectively improves the storage and read performance of small files in electronic health records and, to a certain extent, relieves the NameNode memory bottleneck.
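Authors: Yang Zhifen (杨志芬), Chen Qi (陈绮)
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2016, No. 1, pp. 21-23, 41 (4 pages)
Funding: Key Natural Science Project of the Hainan Provincial Department of Education (Hj kj2013-03)
Keywords: Hadoop; HDFS; small files; storage efficiency; data prefetching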
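
The abstract names three ideas without giving implementation detail: merging small files, indexing them with an RDBMS, and prefetching data on read. The following is a minimal generic sketch of how such ideas can fit together, and it is not the authors' scheme. Everything in it is an assumption made for illustration: the class name `SmallFileStore`, the in-memory `bytearray` standing in for one large HDFS file, the SQLite table standing in for the RDBMS index, and the "prefetch the next files by offset" policy.

```python
import sqlite3


class SmallFileStore:
    """Packs small files into one container and indexes them in a relational table.

    Hypothetical illustration only: the container stands in for a single large
    HDFS file, and SQLite stands in for the RDBMS that holds the index.
    """

    def __init__(self):
        self.container = bytearray()           # one merged "large file"
        self.cache = {}                        # prefetched file contents
        self.db = sqlite3.connect(":memory:")  # relational index
        self.db.execute(
            "CREATE TABLE idx (name TEXT PRIMARY KEY, offset INTEGER, length INTEGER)"
        )

    def put(self, name, data):
        """Append a small file to the container and record where it landed."""
        offset = len(self.container)
        self.container.extend(data)
        self.db.execute(
            "INSERT INTO idx VALUES (?, ?, ?)", (name, offset, len(data))
        )

    def _read(self, offset, length):
        return bytes(self.container[offset:offset + length])

    def get(self, name, prefetch=2):
        """Return a file's bytes; also cache the next `prefetch` files by offset."""
        if name in self.cache:
            return self.cache[name]
        row = self.db.execute(
            "SELECT offset, length FROM idx WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        offset, length = row
        data = self._read(offset, length)
        # Assumed policy: files stored next to each other are likely read together.
        neighbours = self.db.execute(
            "SELECT name, offset, length FROM idx "
            "WHERE offset > ? ORDER BY offset LIMIT ?",
            (offset, prefetch),
        ).fetchall()
        for n_name, n_offset, n_length in neighbours:
            self.cache[n_name] = self._read(n_offset, n_length)
        return data
```

In this sketch the file system would track a single large object instead of one entry per small file, which is the kind of metadata reduction the abstract attributes to small-file merging. The per-file lookups move to the relational index.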

References (10)

  • 1 Lu Jiaheng. Hadoop in Action [M]. Beijing: China Machine Press, 2012. (in Chinese)
  • 2 Bo Dong, Qinghua Zheng, Feng Tian, et al. An optimized approach for storing and accessing small files on cloud storage [J]. Journal of Network and Computer Applications, 2012, 35(6): 1847-1862.
  • 3 Quan Zhang, Dan Feng, Fang Wang. Metadata Performance Optimization in Distributed File System [C]// Proc. of the 2012 IEEE/ACIS 11th International Conference on Computer and Information Science. Shanghai, China: IEEE, 2012: 476-481.
  • 4 Li Pengjun, Chen Guangjie, Guo Wenming. Design of a distributed storage architecture for regional medical images based on HDFS [J]. Journal of Southern Medical University, 2011, 31(3): 495-498. (in Chinese) Cited by: 29
  • 5 Chansler R J. Data Availability and Durability with the Hadoop Distributed File System [J]. ;login: The USENIX Magazine, 2012, 37(1): 16-22.
  • 6 Chuncong Xu, Xiaomeng Huang, Nuo Wu, et al. Using Memcached to Promote Read Throughput in Massive Small-File Storage System [C]// Proc. of the 2010 Ninth International Conference on Grid and Cloud Computing. Beijing, China: GCC, 2010: 24-29.
  • 7 Bo Dong, Jie Qiu, Qinghua Zheng, et al. A Novel Approach to Improving the Efficiency of Storing and Accessing Small Files on Hadoop: A Case Study by PowerPoint Files [C]// International Conference on Services Computing, 2010: 65-72.
  • 8 Xuhui Liu, Jizhong Han, Yunqin Zhong, et al. Implementing WebGIS on Hadoop: A Case Study of Improving Small File I/O Performance on HDFS [C]// Proc. of the 2009 IEEE Conf. on Cluster Computing: 1-8.
  • 9 Zhang Chunming, Rui Jianwu, He Tingting. An approach for storing and accessing small files on Hadoop [J]. Computer Applications and Software, 2012, 29(11): 95-100. (in Chinese) Cited by: 39
  • 10 Zhao Xiaoyong, Yang Yang, Sun Lili, Chen Yu. Hadoop-based storage architecture for mass MP3 files [J]. Journal of Computer Applications, 2012, 32(6): 1724-1726. (in Chinese) Cited by: 28

