
Tibetan Syllable Segmentation Based on Mixed Mode (基于混合模式的藏文音节切分)
Abstract: Drawing on the attachment patterns, structure, and contextual features of Tibetan case-auxiliary words (case particles), this paper proposes a three-layer mixed-mode method for Tibetan syllable segmentation that combines rules, a support vector machine (SVM), and a restoration step. Syllable segmentation is the basis of Tibetan character frequency statistics, word segmentation, part-of-speech tagging, and machine translation. Its main difficulty is the correct identification, segmentation, and restoration of ambiguous abbreviated case-auxiliary words. In experiments, the mixed-mode method achieved an F-measure of 99.97%.
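The abstract does not spell out the rules themselves, so the following is only a minimal sketch of what a first, rule-based layer of this kind could look like, not the authors' implementation. It assumes that syllables are delimited by the tsheg (U+0F0B) and shad (U+0F0D), and that the particle lists `UNAMBIGUOUS` and `AMBIGUOUS` are reasonable stand-ins. Both lists and the function names `split_syllables` and `rule_layer` are hypothetical. Syllables ending in an ambiguous letter are only flagged, since deciding them is the job the abstract assigns to the SVM layer.

```python
import re

TSHEG = "\u0f0b"  # ་ intersyllabic mark
SHAD = "\u0f0d"   # ། clause-final mark

# Assumed particle inventories, for illustration only (not taken from the paper).
UNAMBIGUOUS = (
    "\u0f60\u0f72",  # འི
    "\u0f60\u0f7c",  # འོ
    "\u0f60\u0f58",  # འམ
    "\u0f60\u0f44",  # འང
)
AMBIGUOUS = ("\u0f62", "\u0f66")  # ར and ས: either a suffix letter or a case particle


def split_syllables(text):
    """Split raw text into syllable strings on tsheg, shad, and whitespace."""
    parts = re.split("[" + TSHEG + SHAD + r"\s]+", text)
    return [p for p in parts if p]


def rule_layer(syllable):
    """Return (pieces, needs_classifier) for one syllable.

    If the syllable ends in an assumed-unambiguous abbreviated particle,
    the particle is split off. If it ends in an ambiguous letter, the
    syllable is left whole and flagged for a later classifier.
    """
    for particle in UNAMBIGUOUS:
        if syllable.endswith(particle) and len(syllable) > len(particle):
            return [syllable[:-len(particle)], particle], False
    needs_classifier = len(syllable) > 1 and syllable.endswith(AMBIGUOUS)
    return [syllable], needs_classifier
```

Under these assumptions, `rule_layer` applied to the first syllable of ངའི་དེབ། separates ང from འི, while a syllable such as ངས is returned whole with the flag set to `True`. A restoration step, which the abstract names but does not describe, is not sketched here.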
Authors: 才让当知 (Cairangdangzhi); 华却才让 (Huaquecairang); 却措卓玛 (Quezuozhuoma); 夏吾吉 (XIA Wu-ji) (The Computer College of Qinghai Normal University, Xining 810016, China; Tibetan Information Processing and Machine Translation Key Laboratory of Qinghai Province, Xining 810008, China; Key Laboratory of Tibetan Information Processing, Ministry of Education, Xining 810008, China)
Source: Journal of Inner Mongolia Normal University (Natural Science Edition) (《内蒙古师范大学学报(自然科学汉文版)》), CAS, 2019, No. 5, pp. 406-412 (7 pages)
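Funding: National Social Science Fund of China project (17XYY030); Qinghai Provincial Science and Technology Program project (2017-GX-146); Qinghai Normal University Young and Middle-aged Researchers' Fund project (17ZR11); Qinghai Provincial Key Laboratory projects (2013-Z-Y17, 2014-Z-Y32, 2015-Z-Y03); Key Laboratory of Tibetan Information Processing and Machine Translation (2013-Y-17)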
Keywords: syllable characteristics; abbreviated case-auxiliary words; ambiguous abbreviated case-auxiliary words; support vector machine (SVM)