Abstract
Dialogue act corpora have been built abroad since the end of the 20th century, and related research has made important progress. In China, however, such corpora remain little known, and research on them has only just begun. By reviewing internationally known dialogue corpora (Switchboard, TRAINS, Verbmobil, ICSI-MR, and AMI), this paper describes the characteristics of dialogue corpora in data collection and dialogue act annotation, as well as their wide application in natural language processing. It also points out problems in areas such as manual annotation and the use of terminology. It suggests that researchers in China draw on this experience abroad and bring spoken corpora of Chinese learners of English into the dialogue act framework, in order to expand the breadth and depth of language research.
Author
LI Yanjiao (李艳娇), School of Culture and Communication, Shandong University, Weihai 264209, P.R. China
Source
Foreign Language and Literature Studies (《外国语言文学》), 2023, No. 4, pp. 3-14, 133 (13 pages in total)
Funding
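National Social Science Fund of China, Youth Project "Research on the Annotation and Automatic Recognition of Chinese Conversational Acts" (20CYY021).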