Abstract
This paper proposes a method, based on corpus statistics, for computer-automated emendation of The Secret History of the Mongols (《元朝秘史》), and sets out its advantages over manual emendation. Automated emendation can uncover many errors that manual emendation is unlikely to catch. By setting a threshold and a confidence interval, one can control whether the computer corrects an error automatically or only marks it; passages the computer cannot judge are flagged and left for manual review. The method saves a great deal of manual checking time and is a sound way to handle complex texts.
Source
Applied Linguistics (《语言文字应用》)
CSSCI
Peking University Core Journal (北大核心)
2007, Issue 3, pp. 136-142 (7 pages)
Funding
Funded project of a Key Laboratory of the Chinese Academy of Social Sciences (Phonetics and Computational Linguistics: MZ101)
Keywords
The Secret History of the Mongols (元朝秘史)
emendation
miscopying error types
similarity