
Sub-Sentence Alignment of Chinese-English Law Literature Based on Statistical Approach
Cited by: 7
Abstract: Statistical sentence alignment selects, from all candidate alignments, the one with maximum probability given the lengths of the bilingual sentences. Chinese-English law literature is translated literally, which makes it well suited to statistical alignment. However, the parameter computation methods used for Indo-European language pairs cannot be applied to Chinese-English corpora. Two parameter computation methods for aligning Chinese-English corpora are presented, both of which make the evaluation function in the alignment model follow the standard normal distribution. In the first, the parameter $s^2$ is the slope of the line obtained by linear regression over all points $(l_1, (l_2 - c\,l_1)^2)$ in the training corpus. In the second, $s^2$ is obtained by computing the variance directly. Experimental results show that sub-sentence alignment of Chinese-English law literature achieves a precision of 98.8% and a recall of 99.2%.
Source: Journal of Northeastern University (Natural Science), indexed in EI, CAS, CSCD, and the PKU Core Journals list; 2003, No. 1, pp. 23-26 (4 pages).
Funding: National Natural Science Foundation of China (60083006); National Key Basic Research Development Program of China (G19980305011).
Keywords: bilingual corpora; Chinese-English law literature; sub-sentence alignment; statistical approach; evaluation function; parameter computation; standard normal distribution; Chinese; English; machine translation
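
The two ways of estimating $s^2$ described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions, since the abstract does not spell them out: the evaluation function is taken to have the Gale and Church (1993) form $\delta = (l_2 - c\,l_1)/\sqrt{l_1 s^2}$, $c$ is estimated as the ratio of total target length to total source length, and "computing the variance" is read as the mean squared value of the normalised residual $(l_2 - c\,l_1)/\sqrt{l_1}$. All function names and the sample data are hypothetical.

```python
import math
from statistics import mean


def estimate_c(pairs):
    """Target length per unit of source length (ratio of totals; an assumption)."""
    total_l1 = sum(l1 for l1, _ in pairs)
    total_l2 = sum(l2 for _, l2 in pairs)
    return total_l2 / total_l1


def s2_by_regression(pairs, c):
    """Least-squares slope of the line through the points (l1, (l2 - c*l1)^2).

    Needs at least two distinct l1 values, otherwise the slope is undefined.
    """
    xs = [l1 for l1, _ in pairs]
    ys = [(l2 - c * l1) ** 2 for l1, l2 in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def s2_by_variance(pairs, c):
    """Mean of (l2 - c*l1)^2 / l1, i.e. the second moment about zero of the
    normalised residual (assumed reading of 'compute the variance')."""
    return mean((l2 - c * l1) ** 2 / l1 for l1, l2 in pairs)


def delta(l1, l2, c, s2):
    """Evaluation function in the assumed Gale-Church form; roughly N(0, 1)
    for correctly aligned pairs when c and s2 fit the corpus."""
    return (l2 - c * l1) / math.sqrt(l1 * s2)


def two_tailed_prob(d):
    """P(|Z| >= |d|) for a standard normal Z, computed with erfc."""
    return math.erfc(abs(d) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical (source length, target length) pairs, not data from the paper.
    training = [(12, 30), (20, 44), (35, 81), (8, 15), (50, 108)]
    c = estimate_c(training)
    s2 = s2_by_regression(training, c)
    print("c =", c)
    print("s2 (regression) =", s2)
    print("s2 (variance)   =", s2_by_variance(training, c))
    print("delta(20, 44)   =", delta(20, 44, c, s2))
```

With $c$ and $s^2$ fitted this way, a length-based aligner can score each candidate pairing by `two_tailed_prob(delta(...))` and keep the alignment with the highest overall probability, for example through dynamic programming over candidate pairings.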

