
Auto Detection for English Sentence Boundaries (cited by: 7)
Abstract: Identifying English sentence boundaries is a basic problem in English text analysis and a foundation for subsequent English-Chinese machine translation. This paper combines a statistical decision tree with an error-driven method to identify English sentence boundaries. A decision tree first learns sentence-splitting rules from training sentences, and an error-driven method then further corrects the results obtained. Tested on part of the Penn TreeBank texts, the approach reaches 98.6% accuracy.
Source: Microprocessors (《微处理机》), 2003, No. 1, pp. 30-34 (5 pages)
Keywords: English sentence boundaries; automatic identification; machine translation; sentence boundary detection; decision tree; learning algorithm; natural language processing; error-driven; rules
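
The two-stage idea in the abstract (a tree-based first pass, then error-driven correction) can be illustrated with a minimal sketch. This record does not give the paper's features, tree, or rule templates, so everything below is an assumption: `tree_predict` is a hand-written stand-in for a learned decision tree, the rule template "if the word before the period is w, set the label to b" is hypothetical, and the training examples are invented for illustration.

```python
from typing import List, Optional, Tuple

# One candidate per period: (word before '.', next word or None, is_boundary)
Example = Tuple[str, Optional[str], bool]
Rule = Tuple[str, bool]  # hypothetical template: if prev == word, label = value


def tree_predict(prev: str, nxt: Optional[str]) -> bool:
    """Hand-written stand-in for a learned decision tree (assumption)."""
    if nxt is None:          # the period ends the text
        return True
    return nxt[0].isupper()  # a capitalized next word suggests a new sentence


def learn_corrections(train: List[Example]) -> List[Rule]:
    """Greedily pick the rules that most reduce first-pass errors on `train`."""
    preds = [tree_predict(p, n) for p, n, _ in train]
    rules: List[Rule] = []
    while True:
        best, best_gain = None, 0
        for w in sorted({p for p, _, _ in train}):
            for b in (True, False):
                gain = 0
                for (p, _, gold), pred in zip(train, preds):
                    if p == w and pred != b:
                        # Changing the label fixes an error or introduces one.
                        gain += 1 if gold == b else -1
                if gain > best_gain:
                    best, best_gain = (w, b), gain
        if best is None:     # no rule reduces the error count any further
            return rules
        rules.append(best)
        preds = [best[1] if p == best[0] else pred
                 for (p, _, _), pred in zip(train, preds)]


def predict(prev: str, nxt: Optional[str], rules: List[Rule]) -> bool:
    """First pass, then the learned corrections in the order they were learned."""
    label = tree_predict(prev, nxt)
    for w, b in rules:
        if prev == w:
            label = b
    return label


# Invented toy data: abbreviations followed by capitalized names fool the first pass.
train: List[Example] = [
    ("Dr", "Smith", False),
    ("Mr", "Jones", False),
    ("Dr", "Lee", False),
    ("home", "He", True),
    ("late", "The", True),
    ("etc", "and", False),
]
rules = learn_corrections(train)
print(rules)
print(predict("Dr", "Brown", rules), predict("home", "She", rules))
```

On this toy data the first pass marks every period before a capitalized word as a boundary, and the correction stage learns to override it after "Dr" and "Mr". A real system would learn the tree itself from annotated text and use richer rule templates.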

References (6)

  • [1] Riley, M. D. Some Application of Tree-Modeling to Speech and Language Indexing. In Proceedings of the DARPA Speech and Natural Language Workshop, 1989: 339-352.
  • [2] Humphrey, T., and Zhou, F. Period Disambiguation Using a Neural Network. In IJCNN: International Joint Conference on Neural Networks, 1989: 606.
  • [3] Palmer, D. D., and Hearst, M. A. Adaptive Sentence Boundary Disambiguation. UC Berkeley Computer Science Technical Report UCB/CSD-94-797, 1994. Also CL, 1997.
  • [4] Palmer, D. D. Experiments in Multilingual Sentence Boundary Recognition. In Proc. of Recent Advances in Natural Language Processing, Bulgaria, 1995.
  • [5] Mikheev, A. Periods, Capitalized Words etc. Computational Linguistics, 9884 (Vol. 16, No. 1), 1994.
  • [6] Mikheev, A. A Knowledge-free Method for Capitalized Word Disambiguation. In Proc. of the 37th Annual Meeting of the ACL, 1999.

Co-cited documents: 38

Citing documents: 7

Second-level citing documents: 29
