Abstract
Pinyin-to-Chinese-character conversion has many applications in Chinese language processing. This paper uses probabilistic latent semantic analysis (PLSA) to capture the latent semantic knowledge present in the text during Pinyin-to-Chinese-character conversion, and integrates this long-distance semantic knowledge with the conversion model to improve conversion accuracy. The experiments also study incorporating additional contextual knowledge into the model, which further improves the conversion results.
Authors
郑叶清
刘功申
ZHENG Ye-qing, LIU Gong-shen (School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China)
Source
《信息技术》
2016, No. 11, pp. 33-37, 41 (6 pages)
Information Technology
Funding
973 Program (2013CB329603)
National Natural Science Foundation of China (61472248, 61171173)
Keywords
probabilistic latent semantic analysis
Pinyin-to-Chinese-character conversion
statistical language model