

Patent literatures translation system based on Hadoop
Abstract: To address the storage and translation of massive volumes of patent literature, this paper proposes a patent translation system based on Hadoop. For data storage, the system adopts a hybrid structure that combines HDFS and HBase; for translation, it uses a parallel translation model built on Hadoop MapReduce. Experimental results show that, compared with conventional machine translation approaches, the proposed system achieves better data storage and translation performance.
Author: Chai Xiaohui (柴晓辉)
Source: 《信息技术》 (Information Technology), 2015, No. 10, pp. 30-33, 37 (5 pages in total)
Funding: National Key Basic Research Program of China (973 Program), Grant No. 2013CB329303
Keywords: patent translation; Hadoop; MapReduce; HBase; HDFS
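
The abstract names MapReduce as the parallel translation model but gives no implementation details. The following is a minimal single-machine sketch of how such a job could be structured, not the paper's actual method. Everything in it is an assumption made for illustration: the toy lexicon, the `translate_sentence` stand-in for a real translation engine, the record layout `(doc_id, position, sentence)`, and the use of a thread pool in place of a Hadoop cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy lexicon standing in for a real machine translation engine.
TOY_LEXICON = {"专利": "patent", "翻译": "translation", "系统": "system"}


def translate_sentence(sentence):
    """Toy word-by-word lookup; unknown tokens pass through unchanged."""
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in sentence.split())


def map_phase(record):
    """Map: (doc_id, position, sentence) -> (doc_id, (position, translation))."""
    doc_id, position, sentence = record
    return doc_id, (position, translate_sentence(sentence))


def shuffle(pairs):
    """Group mapped values by document id, as a framework would between phases."""
    grouped = defaultdict(list)
    for doc_id, value in pairs:
        grouped[doc_id].append(value)
    return grouped


def reduce_phase(doc_id, values):
    """Reduce: restore sentence order and join into one translated document."""
    ordered = [text for _, text in sorted(values)]
    return doc_id, "\n".join(ordered)


def run_job(records, workers=4):
    """Run the map phase in parallel, then shuffle and reduce per document."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, records))
    return dict(reduce_phase(d, v) for d, v in shuffle(mapped).items())


# Sentences may arrive out of order; the position field lets reduce reorder them.
records = [
    ("CN001", 1, "翻译 系统"),
    ("CN001", 0, "专利 翻译"),
    ("CN002", 0, "专利 系统"),
]
result = run_job(records)
```

Keying each sentence by document id and position means the map tasks are independent of one another, which is what would allow them to be spread across cluster nodes; the reduce step only needs the positions to reassemble each document.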







