
English-Chinese Clause Alignment Corpus Tagging System Based on Component Sharing (基于成分共享的英汉小句对齐语料库标注体系研究)

Cited by: 2
Abstract An English-Chinese clause alignment corpus supports the study and application of correspondences between the grammatical structures of English and Chinese clauses, and is of considerable significance to linguistic theory and to translation, both human and machine. Earlier grammatical theories and related corpus work did not study the definitions of the clause and the clause complex in sufficient depth; this leaves a theoretical gap and makes them difficult to apply in natural language processing. This paper first lays the theoretical groundwork for building an English-Chinese clause alignment corpus. Starting from the recently proposed theory of the Chinese clause complex, it defines the concept of component sharing, and then defines the English clause and clause complex on the basis of naming sharing and quotation sharing, so that clauses and clause complexes are functionally complete and unitary. On this basis, the paper designs an annotation scheme for English-Chinese clause alignment, comprising English NT clause tagging and the generation and combination of Chinese translations. Annotation of the corpus shows that, at the clause complex level, the units involved in structural transformation during English-Chinese translation can be restricted to English clauses and their namings and tellings, without reference to the internal structure of the namings and tellings. The resulting English-Chinese clause-aligned corpus provides structurally annotated samples for linguistic research, English-Chinese contrastive studies, and applications such as English-Chinese machine translation.
Authors GE Shili (葛诗利); SONG Rou (宋柔). Affiliations: Laboratory of Language and Artificial Intelligence, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510420, China; School of Information Science, Beijing Language and Culture University, Beijing 100083, China
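Source Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 6, pp. 27-35 (9 pages)
Funding National Natural Science Foundation of China (61672175); Key Project of the State Language Commission (ZDI135-30)
Keywords component sharing; naming sharing; clause; clause complex; English-Chinese machine translation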