Abstract
Detecting erroneous syllables is an important step in proofreading Tibetan text. Based on an analysis of modern Tibetan syllables, this paper divides them into regular syllables, which follow the component combination rules, and irregular syllables, which do not. Regular syllables are checked against the component combination rules. Irregular syllables are checked against three purpose-built dictionaries: a dictionary of Sanskrit-origin Tibetan words, a dictionary of transliterated Tibetan words, and a dictionary of native irregular syllables. Experiments show that the proposed method achieves an error detection rate of 100% on Tibetan newspaper text.
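The two-stage check described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the pattern only encodes the coarse slot structure of a syllable (optional prefix, root stack, optional vowel, optional suffix and second suffix or a genitive-type affix) and does not check which prefixes or stacked letters are compatible with which roots. The names `REGULAR`, `IRREGULAR`, `is_valid_syllable` and `find_errors` are hypothetical, and the two dictionary entries are placeholders standing in for the paper's three dictionaries.

```python
import re

# The 30 basic consonants, written as code points to avoid font issues.
BASE = ("\u0F40\u0F41\u0F42\u0F44\u0F45\u0F46\u0F47\u0F49\u0F4F\u0F50"
        "\u0F51\u0F53\u0F54\u0F55\u0F56\u0F58\u0F59\u0F5A\u0F5B\u0F5D"
        "\u0F5E\u0F5F\u0F60\u0F61\u0F62\u0F63\u0F64\u0F66\u0F67\u0F68")
PREFIX = "\u0F42\u0F51\u0F56\u0F58\u0F60"   # ga, da, ba, ma, 'a
SUFFIX = ("\u0F42\u0F44\u0F51\u0F53\u0F56"  # ga, nga, da, na, ba,
          "\u0F58\u0F60\u0F62\u0F63\u0F66")  # ma, 'a, ra, la, sa
BEFORE_SA = "\u0F42\u0F44\u0F56\u0F58"      # suffixes that may take second suffix sa
VOWEL = "\u0F72\u0F74\u0F7A\u0F7C"          # i, u, e, o
SA, ACHUNG = "\u0F66", "\u0F60"

# Coarse slot pattern (an assumption of this sketch, not the paper's rule set).
REGULAR = re.compile(
    f"[{PREFIX}]?"                 # optional prefix letter
    f"[{BASE}][\u0F90-\u0FBC]*"    # root stack: base letter plus subjoined letters
    f"[{VOWEL}]?"                  # optional vowel sign
    f"(?:[{BEFORE_SA}]{SA}"        # suffix followed by second suffix sa
    f"|[{SUFFIX}]"                 # or a single suffix
    f"|{ACHUNG}[\u0F72\u0F74\u0F7C]"      # or an affix such as 'i, 'u, 'o
    f"|{ACHUNG}[\u0F44\u0F58])?"          # or 'ang, 'am
)

# Placeholder entries standing in for the irregular-syllable dictionaries.
IRREGULAR = {
    "\u0F68\u0F7C\u0F7E",          # om (Sanskrit origin)
    "\u0F67\u0F71\u0F74\u0F83",    # hum (Sanskrit origin)
}

def is_valid_syllable(syllable: str) -> bool:
    """Accept a syllable if it fits the regular pattern or is a listed exception."""
    return bool(REGULAR.fullmatch(syllable)) or syllable in IRREGULAR

def find_errors(text: str) -> list[str]:
    """Split on tsheg (U+0F0B), shad (U+0F0D) and whitespace; return rejected syllables."""
    syllables = [s for s in re.split("[\u0F0B\u0F0D\\s]+", text) if s]
    return [s for s in syllables if not is_valid_syllable(s)]
```

Trying the rules first and falling back to dictionary lookup keeps the dictionaries small, since they only need to list syllables the rules cannot generate. A real checker would replace the coarse pattern with the full component combination rules.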
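Funding
2016 Innovation Support Program for Young Teachers in Tibet Universities, project "Research on Key Information Extraction Techniques for Tibetan Web Pages" (Project No. QCZ2016-13)
2016 Innovation Support Program for Young Teachers in Tibet Universities, project "Construction of a Modern Tibetan Syllable Table and Recognition of Syllable Components" (Project No. QCZ2016-11)
Interim result of the 2015 Everest Scholars (珠峰学者) Talent Development Support Program, Young Backbone Teachers project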
Keywords
Tibetan syllables
Syllable components
Combination rules
Error detection