Word-alignment algorithm combining a bidirectional dictionary and semantic similarity calculation (cited by: 1)
Abstract: Statistical word-alignment methods require large-scale bilingual corpora as input, are prone to data sparsity, and are computationally expensive. To meet the need for real-time alignment at the sentence or paragraph level, this paper proposes an efficient word-alignment algorithm based on a bidirectional dictionary and semantic similarity calculation. By combining dynamic chunk segmentation and matching, HowNet-based semantic similarity calculation, maximum-matching-based conflict resolution, and pruning-based disambiguation, the algorithm effectively handles the alignment of approximate translations that arise from the flexibility and diversity of translation. Experiments show that the algorithm retains the advantages of dictionary-based word alignment while addressing the shortcomings of traditional dictionary-based methods: it improves both the accuracy and the recall of word alignment, and it is better suited to small-scale bilingual corpora and real-time alignment.
Authors: Yin Baosheng (尹宝生), Yang Yang (杨阳)
Source: Journal of Shenyang Aerospace University (《沈阳航空航天大学学报》), 2015, No. 2, pp. 67-74 (8 pages)
Funding: Liaoning Province "Hundred, Thousand and Ten Thousand Talents" fund project (Project No. 04021401)
Keywords: word alignment; bidirectional dictionary; dynamic chunk segmentation and matching; semantic similarity calculation
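The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It scores candidate word pairs with a bidirectional dictionary lookup, falls back to a semantic similarity function when the dictionary has no direct entry, and resolves conflicts greedily so that each word is linked at most once. Everything concrete here is an assumption made for illustration: the dictionary entries, the `toy_similarity` function (a stand-in for a HowNet-based measure), the score values, the threshold, and the greedy tie-breaking rule. Dynamic chunk segmentation and pruning are not modeled.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy bidirectional dictionary (not taken from the paper).
ZH_EN: Dict[str, Set[str]] = {"我": {"i", "me"}, "喜欢": {"like", "love"}, "猫": {"cat"}}
EN_ZH: Dict[str, Set[str]] = {"i": {"我"}, "like": {"喜欢"}, "cat": {"猫"}, "cats": {"猫"}}

# Hypothetical synonym groups standing in for a HowNet-based similarity measure.
SYNONYM_GROUPS: List[Set[str]] = [{"like", "love", "adore", "enjoy"}]


def toy_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical words, 0.8 for words in one synonym group, else 0.0."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in SYNONYM_GROUPS):
        return 0.8
    return 0.0


def score_pair(src: str, tgt: str, sim: Callable[[str, str], float]) -> float:
    """Score a (source, target) word pair; the 1.0 / 0.9 values are arbitrary choices."""
    forward = tgt in ZH_EN.get(src, set())
    backward = src in EN_ZH.get(tgt, set())
    if forward and backward:
        return 1.0
    if forward or backward:
        return 0.9
    # No dictionary hit: compare the target word with each dictionary translation.
    return max((sim(t, tgt) for t in ZH_EN.get(src, set())), default=0.0)


def align(src_tokens: List[str], tgt_tokens: List[str],
          sim: Callable[[str, str], float] = toy_similarity,
          threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return one-to-one links (source index, target index), chosen greedily."""
    candidates = [
        (score_pair(s, t, sim), i, j)
        for i, s in enumerate(src_tokens)
        for j, t in enumerate(tgt_tokens)
    ]
    candidates = [c for c in candidates if c[0] >= threshold]
    # Highest score first; ties go to the pair with the smaller position offset.
    candidates.sort(key=lambda c: (-c[0], abs(c[1] - c[2])))
    used_src: Set[int] = set()
    used_tgt: Set[int] = set()
    links: List[Tuple[int, int]] = []
    for _, i, j in candidates:
        if i in used_src or j in used_tgt:
            continue  # conflict: one of the two words is already linked
        used_src.add(i)
        used_tgt.add(j)
        links.append((i, j))
    return sorted(links)


if __name__ == "__main__":
    print(align(["我", "喜欢", "猫"], ["i", "adore", "cats"]))
```

In this toy run, "喜欢" has no dictionary entry for "adore", so the link comes from the similarity fallback, while "猫" and "cats" are linked through the reverse dictionary only. A real system would replace `toy_similarity` with a proper lexical similarity measure and tune the threshold on held-out data.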