Abstract
This paper studies a morpheme-based language model for Kazakh. Previous work mainly built corpora and language models using the word or the syllable as the basic unit. In this study, Kazakh words are instead decomposed into stems and affixes to form morphemes, and the language model is built from the resulting morphemes. Such a model plays an important role in Kazakh language processing tasks such as word segmentation, spelling error detection, and language model optimization. The experimental results show that the model is clearly effective at segmenting Kazakh words into stems and affixes, with segmentation accuracy reaching 80%.
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2018, Issue 4Z, pp. 189-191 (3 pages)
Funding
National Natural Science Foundation of China (Grant No. 61462085)