Abstract
To address the lack of research in China on machine translation of Arabic place names, this paper analyzes the morphological structure and phonetic characteristics of Arabic place names and proposes a dedicated machine translation method. First, common words are extracted from a large place-name corpus using the pointwise mutual information formula. Next, place-name templates are extracted using a directed acyclic graph data structure. The lexical structure of the place name to be translated is then parsed by template matching, and the proper names within that structure are transliterated with a transliteration model based on syllable division. Finally, the translated parts are combined and output as the result. Experiments on translating Arabic place names verified the effectiveness of the method, which has practical significance for building China's global geographic information resources.
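The first step of the method, extracting common words with pointwise mutual information (PMI), can be illustrated with a minimal sketch. The abstract does not say how the paper applies the formula, so this example assumes one common reading: PMI(x, y) = log2(p(x, y) / (p(x) p(y))) computed over adjacent word pairs in a place-name corpus. The function name `pmi_scores`, the `min_count` threshold, and the romanized toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_scores(place_names, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    Assumption: each place name is a whitespace-separated string, and
    probabilities are simple relative frequencies over the corpus.
    """
    unigrams = Counter()
    bigrams = Counter()
    for name in place_names:
        tokens = name.split()
        unigrams.update(tokens)
        # Adjacent pairs only; a one-word name contributes no pair.
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_xy = count / total_bigrams
        p_x = unigrams[x] / total_unigrams
        p_y = unigrams[y] / total_unigrams
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical romanized toy corpus, for illustration only.
corpus = ["wadi al kabir", "wadi al saghir", "jabal al nur", "jabal musa"]
for pair, score in sorted(pmi_scores(corpus).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs with high PMI co-occur more often than their individual frequencies would predict, which is one plausible signal for recurring components of place names. How the paper sets thresholds or tokenizes Arabic script is not stated in the abstract.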
Authors
任洪凯
王继周
毛曦
马维军
殷红梅
REN Hongkai; WANG Jizhou; MAO Xi; MA Weijun; YIN Hongmei (Shandong University of Science and Technology, Qingdao, Shandong 266590, China; Chinese Academy of Surveying & Mapping, Beijing 100036, China)
Source
《测绘科学》
CSCD
Peking University Core Journals (北大核心)
2020, No. 8, pp. 157-163 (7 pages)
Science of Surveying and Mapping
Funding
Basic Scientific Research Fund Project of the Chinese Academy of Surveying and Mapping (AR1912).
Keywords
place names
machine translation
pointwise mutual information
directed acyclic graph
lexical structure analysis
syllable division
forward maximum matching algorithm