Abstract
This article proposes the idea of "natural language processing based on naturally annotated Web resources" and gives a preliminary account of it from three perspectives: the definition and basic types of naturally annotated resources, computing based on naturally annotated resources, and some initial considerations at the methodological level. It closes by raising a fundamental question for further exploration: if all the information and knowledge that naturally annotated resources can provide were exploited systematically and to the fullest, and integrated as tightly as possible, could the machine, as hoped, ultimately achieve some depth of understanding of natural language?
Source
《中文信息学报》
CSCD
PKU Core Journal (北大核心)
2011, Issue 6, pp. 26-32 (7 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (Grant No. 60873174)
Keywords
naturally annotated resource
user-generated data
Web
natural language processing