Abstract
Annotating raw corpus data is an important task in building Chinese interlanguage corpora. Current annotation methods, however, still have many problems: they fail to preserve the raw corpus, error annotations carry incomplete information, correct sentences cannot be annotated, annotation boundaries are unclear, and annotation codes are inconsistent. XML-based annotation can effectively solve these problems. It helps to gradually achieve automated annotation by computer, supports the continuous improvement of the annotation scheme, and promotes the sustainable development of interlanguage corpus construction.
Source
Journal of Fujian Jiangxia University (《福建江夏学院学报》)
2015, No. 6, pp. 99-105 (7 pages)
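Funding
Fujian Province Education and Research Fund Project for Young and Middle-aged Teachers (JA13331S)
Fujian Jiangxia University Young Research Talent Cultivation Fund Project (JXS2013018)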