An Algorithm for Automatic Sentence Alignment of English and Chinese Parallel Corpora (一种汉英双语句子自动对齐算法)

Cited by: 4
Abstract: Building bilingual corpora and aligning them automatically is of great significance to the development of computational linguistics. Alignment is the core technology for processing bilingual text, and its quality directly affects downstream work such as computer-assisted translation. Based on the characteristics of Chinese-English bilingual text, this paper proposes a new hybrid algorithm for sentence alignment. The algorithm relies mainly on a new length-based alignment method, combines it with a lexicon-based method, and aligns in both the forward and reverse directions to further improve sentence alignment accuracy. The algorithm was evaluated on 100 files containing more than 5,000 English-Chinese bilingual sentences. The alignment results were satisfactory, which indicates that the algorithm is feasible in practical work.
Source: Computer Simulation (《计算机仿真》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 2, pp. 329-333 (5 pages)
Keywords: bilingual corpus; sentence alignment; hybrid algorithm
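
The abstract names three ingredients: a length-based score, a lexicon-based score, and alignment in both the forward and reverse directions. The paper's actual formulas are not given on this page, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a greedy left-to-right aligner (standard length-based aligners typically use dynamic programming instead), an assumed English-to-Chinese character length ratio of 0.35, assumed bead penalties and lexicon bonus, and it interprets "forward and reverse" as running the same aligner from the start and from the end of the text and keeping only the beads both passes agree on. All function names and constants are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A bead pairs a group of source sentence indices with a group of target indices.
Bead = Tuple[Tuple[int, ...], Tuple[int, ...]]

# Candidate bead shapes (source count, target count) with assumed penalties.
SHAPES = {(1, 1): 0.0, (1, 2): 0.4, (2, 1): 0.4}


def bead_cost(src: List[str], tgt: List[str],
              lexicon: Dict[str, Set[str]], ratio: float) -> float:
    """Length mismatch minus a bonus per lexicon hit; lower is better."""
    en = " ".join(src).lower()
    zh = "".join(tgt)
    expected = len(en) * ratio  # expected Chinese length in characters (assumed ratio)
    mismatch = abs(len(zh) - expected) / max(len(zh), expected, 1.0)
    hits = sum(
        1 for word in en.split()
        if any(t in zh for t in lexicon.get(word.strip(".,;:!?"), ()))
    )
    return mismatch - 0.2 * hits  # 0.2 is an assumed lexicon bonus


def greedy_align(src: List[str], tgt: List[str],
                 lexicon: Dict[str, Set[str]], ratio: float) -> List[Bead]:
    """Pick the cheapest bead at each position, moving left to right."""
    beads: List[Bead] = []
    i = j = 0
    while i < len(src) and j < len(tgt):
        best = None
        for (di, dj), penalty in SHAPES.items():
            if i + di > len(src) or j + dj > len(tgt):
                continue
            cost = penalty + bead_cost(src[i:i + di], tgt[j:j + dj], lexicon, ratio)
            if best is None or cost < best[0]:
                best = (cost, di, dj)
        _, di, dj = best
        beads.append((tuple(range(i, i + di)), tuple(range(j, j + dj))))
        i, j = i + di, j + dj
    return beads  # trailing sentences on either side are left unaligned


def bidirectional_align(src: List[str], tgt: List[str],
                        lexicon: Dict[str, Set[str]],
                        ratio: float = 0.35) -> List[Bead]:
    """Keep only the beads found by both the forward and the reverse pass."""
    n, m = len(src), len(tgt)
    forward = set(greedy_align(src, tgt, lexicon, ratio))
    backward = set()
    for s_idx, t_idx in greedy_align(src[::-1], tgt[::-1], lexicon, ratio):
        # Map indices of the reversed lists back to the original positions.
        backward.add((tuple(sorted(n - 1 - k for k in s_idx)),
                      tuple(sorted(m - 1 - k for k in t_idx))))
    return sorted(forward & backward)


if __name__ == "__main__":
    english = ["The weather is good today.", "I want to go to the park."]
    chinese = ["今天天气很好。", "我想去公园。"]
    toy_lexicon = {"weather": {"天气"}, "park": {"公园"}}
    for bead in bidirectional_align(english, chinese, toy_lexicon):
        print(bead)
```

Because a greedy pass can commit to a wrong bead and carry the error forward, the two passes can disagree in the middle of a text; intersecting them trades coverage for precision, which is one plausible reading of how a two-way check could raise accuracy.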