Abstract
Word-level alignment of a bilingual corpus is essential to making full use of the corpus. This paper presents an effective word alignment algorithm for a Chinese-English corpus that is already aligned at the sentence level, drawing on both dictionary knowledge and corpus statistics. First, the dictionary translations of each word, together with their synonyms from a thesaurus, are used to search for the aligned word in the target-language sentence. Second, co-occurrence statistics between Chinese words and English words are used to find aligned words and adjacent phrases by maximizing mutual information. Our experiments show that the method is effective.
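The two-stage procedure in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas or data structures, so everything here is an assumption made for illustration: mutual information is taken to be pointwise mutual information over sentence-level co-occurrence counts, links are chosen greedily one Chinese word at a time, and the names `candidate_translations`, `cooccurrence_counts`, `pmi`, and `align` are hypothetical. Phrase alignment is left out.

```python
import math
from collections import Counter

def candidate_translations(zh_word, dictionary, thesaurus):
    """Dictionary translations of zh_word plus their thesaurus synonyms."""
    translations = set(dictionary.get(zh_word, ()))
    for t in list(translations):
        translations |= set(thesaurus.get(t, ()))
    return translations

def cooccurrence_counts(pairs):
    """Count the sentence pairs in which each word and each (zh, en) pair occurs."""
    zh_count, en_count, joint = Counter(), Counter(), Counter()
    for zh_sent, en_sent in pairs:
        zh_set, en_set = set(zh_sent), set(en_sent)
        zh_count.update(zh_set)
        en_count.update(en_set)
        joint.update((z, e) for z in zh_set for e in en_set)
    return zh_count, en_count, joint

def pmi(z, e, zh_count, en_count, joint, n):
    """Assumed score: log2(P(z, e) / (P(z) * P(e))); -inf if z and e never co-occur."""
    c = joint[(z, e)]
    if c == 0:
        return float("-inf")
    return math.log2(c * n / (zh_count[z] * en_count[e]))

def align(zh_sent, en_sent, dictionary, thesaurus, stats, n):
    """Greedy two-pass alignment of one sentence pair (a simplification)."""
    links, used = {}, set()  # links: Chinese position -> English position
    # Pass 1: link words through dictionary translations and their synonyms.
    for i, z in enumerate(zh_sent):
        cands = candidate_translations(z, dictionary, thesaurus)
        for j, e in enumerate(en_sent):
            if j not in used and e in cands:
                links[i] = j
                used.add(j)
                break
    # Pass 2: link each remaining word to the free English word with maximal PMI.
    for i, z in enumerate(zh_sent):
        if i in links:
            continue
        free = [j for j in range(len(en_sent)) if j not in used]
        if not free:
            continue
        best = max(free, key=lambda j: pmi(z, en_sent[j], *stats, n))
        if pmi(z, en_sent[best], *stats, n) > float("-inf"):
            links[i] = best
            used.add(best)
    return [(zh_sent[i], en_sent[j]) for i, j in sorted(links.items())]

# Toy data, invented for this example only.
pairs = [
    (["我", "喜欢", "苹果"], ["i", "like", "apples"]),
    (["我", "喜欢", "茶"], ["i", "like", "tea"]),
    (["他", "买", "苹果"], ["he", "buys", "apples"]),
]
dictionary = {"我": {"i"}, "买": {"purchase"}}
thesaurus = {"purchase": {"buy", "buys"}}
stats = cooccurrence_counts(pairs)
n = len(pairs)
result = align(*pairs[2], dictionary, thesaurus, stats, n)
print(result)
```

In the third toy sentence pair, 买 is linked in the first pass because "buys" is a thesaurus synonym of its dictionary translation "purchase", and the remaining two words are linked by the statistical pass. Pointwise mutual information tends to favor rare words, so a practical system would likely need frequency thresholds or a different association score.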
Source
《情报学报》
CSSCI
PKU Core Journal (北大核心)
1997, No. 1, pp. 21-27 (7 pages)
Journal of the China Society for Scientific and Technical Information