Research of Bilingual Term Extraction Based on Comparable Corpora (基于可比语料库的双语术语抽取技术研究)
Abstract This paper surveys bilingual term extraction based on comparable corpora, an important branch of bilingual term extraction. Current methods rest on the "context similarity" hypothesis: if two words co-occur in the source language, their translations will also co-occur in the target language. The technique comprises two main tasks: constructing context-feature models for candidate words, and optimizing those models. The paper proposes a preliminary evaluation criterion for existing work, analyzes and summarizes both tasks by level of method complexity, and points out the open problems. It closes with an outlook on future research. According to the experimental results reported by the surveyed researchers, the technique can be applied to machine-aided translation and to building bilingual dictionaries.
Authors 俞卓 (Yu Zhuo), 黄河燕 (Huang Heyan)
Source Journal of the China Society for Scientific and Technical Information (《情报学报》), CSSCI, PKU Core Journal, 2011, No. 12, pp. 1286-1292 (7 pages)
Funding Supported by the National High Technology Research and Development Program of China ("863" Program), project no. 2006AA010109.
Keywords bilingual term extraction based on comparable corpora; bilingual corpora; comparable corpora; context features
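
The context-similarity hypothesis described in the abstract can be illustrated with a small sketch. This is not the authors' method or any specific system from the survey; it is one common way to realize the idea, under stated assumptions: context features are raw co-occurrence counts in a fixed window, a small seed dictionary (hypothetical here) maps source context words to target words, and candidates are ranked by cosine similarity. The corpora, the seed dictionary, and all function names are invented for illustration.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """For each word, count the words that co-occur within `window` tokens."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            neighbours = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(neighbours)
    return vectors


def project(vector, seed_dict):
    """Map a source-language context vector onto target-language words.

    Context words missing from the seed dictionary are simply dropped.
    """
    projected = Counter()
    for word, count in vector.items():
        if word in seed_dict:
            projected[seed_dict[word]] += count
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(term, src_vecs, tgt_vecs, seed_dict, candidates):
    """Rank target-language candidates by context similarity to `term`."""
    query = project(src_vecs[term], seed_dict)
    scored = [(cand, cosine(query, tgt_vecs[cand])) for cand in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy comparable corpora (already tokenized); purely illustrative data.
source = [
    ["语料库", "包含", "大量", "文本"],
    ["研究者", "构建", "语料库"],
]
target = [
    ["the", "corpus", "contains", "many", "texts"],
    ["researchers", "build", "a", "corpus"],
    ["the", "dictionary", "lists", "many", "words"],
]
# Hypothetical seed dictionary covering some context words only.
seed = {"包含": "contains", "大量": "many", "文本": "texts",
        "构建": "build", "研究者": "researchers"}

src_vecs = context_vectors(source)
tgt_vecs = context_vectors(target)
ranking = rank_candidates("语料库", src_vecs, tgt_vecs, seed, ["corpus", "dictionary"])
print(ranking)
```

On this toy data "corpus" ranks above "dictionary" because it shares more translated context words with 语料库. Replacing raw counts with an association weight, or improving the seed dictionary, would be one way to approach the "optimization of the context-feature model" task the abstract mentions, though the choice of weighting is an assumption here and not something the abstract specifies.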