Optimising Chinese Word Segmentation Based on Web Knowledge (基于Web知识的中文分词结果优化)
Cited by: 6
Abstract: As people become increasingly active on the Internet, new Web words keep emerging. Existing Chinese word segmentation systems recognise new words with relatively low efficiency. This efficiency directly affects segmentation precision and, in turn, the service quality of Internet applications. Starting from the output of an existing segmentation system, we propose a method that recognises new words by drawing on Web knowledge such as search engines and Baidu Baike (Baidupedia), combining statistics with matching, and then uses the recognised words to optimise the system's original segmentation results. Experimental data show that the proposed method effectively identifies new Web words and optimises the segmentation results.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, Issue 12, pp. 55-58 (4 pages)
Keywords: Chinese word segmentation; unknown word; new Web word; search engine; word segmentation optimisation
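
The abstract only names the ingredients of the method (an existing segmenter's output, Web knowledge, statistics plus matching), so the following is a minimal sketch of one way those pieces could fit together, not the authors' actual algorithm. All names are hypothetical: `titles` stands in for entry titles looked up in an encyclopedia such as Baidu Baike, `hit_counts` stands in for search-engine result counts, and the cohesion ratio and its `threshold` of 0.5 are illustrative assumptions. No network access is performed; both knowledge sources are plain in-memory collections.

```python
from typing import Dict, List, Set


def cohesion(parts: List[str], hit_counts: Dict[str, int]) -> float:
    """Statistic (assumed): hits for the joined string divided by hits
    for its rarest part. Returns 0.0 when any part has no recorded hits."""
    whole = hit_counts.get("".join(parts), 0)
    weakest = min(hit_counts.get(p, 0) for p in parts)
    return whole / weakest if weakest > 0 else 0.0


def is_new_word(parts: List[str], titles: Set[str],
                hit_counts: Dict[str, int], threshold: float = 0.5) -> bool:
    """Matching: the joined string is a known entry title.
    Statistics: the joined string is cohesive enough by hit counts."""
    candidate = "".join(parts)
    return candidate in titles or cohesion(parts, hit_counts) >= threshold


def optimise_segmentation(tokens: List[str], titles: Set[str],
                          hit_counts: Dict[str, int],
                          max_span: int = 3,
                          threshold: float = 0.5) -> List[str]:
    """Re-merge adjacent tokens of an original segmentation when their
    concatenation looks like a new word, trying the longest span first."""
    result: List[str] = []
    i = 0
    while i < len(tokens):
        for span in range(min(max_span, len(tokens) - i), 1, -1):
            parts = tokens[i:i + span]
            if is_new_word(parts, titles, hit_counts, threshold):
                result.append("".join(parts))
                i += span
                break
        else:
            # No merge succeeded: keep the original token unchanged.
            result.append(tokens[i])
            i += 1
    return result


# Hypothetical segmenter output in which the Web word "给力" was split.
original = ["我", "喜欢", "给", "力", "的", "表演"]
print(optimise_segmentation(original, {"给力"}, {}))
```

Trying the longest span first is a design choice of this sketch; it favours longer candidate words when both a long and a short merge would qualify.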
