What Can We Do with Statistical Language Models? (统计语言模型能做什么?)
Cited by: 31
Abstract: Chinese information processing (CIP) has made great achievements over the past two decades, as is plain to see. An important task now facing the research community is to establish an overall strategic objective and to achieve substantive breakthroughs as soon as possible in the development directions that society urgently needs. To that end, some points need to be clarified first. For example, must CIP be advanced on the basis of Chinese language understanding? And for the urgently needed CIP tasks, which approach is most suitable? The paper first gives a brief review of the history of natural language processing (NLP) in China and abroad, showing that the move from small-scale, restricted language processing to large-scale processing of real text is an irresistible historical trend. It then uses concrete examples to show what problems statistical language models (SLMs) can solve and why they have repeatedly come out ahead on tasks with comparable evaluations. This illustrates that comparable evaluation, with uniform test data and a uniform scoring method, is a powerful lever for scientific and technological progress. We should take up this weapon.
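Author: 黄昌宁 (Huang Changning)
Affiliation: Microsoft Research Asia
Source: Applied Linguistics (《语言文字应用》), 2002, No. 1, pp. 77-84 (8 pages). Indexed in CSSCI and the Peking University Core Journals list.
Keywords: Chinese information processing; statistical language model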
