
A Summary of the Research Progress of Chinese Word Segmentation Technology (中文分词技术研究进展综述)

Cited by: 7
Abstract Chinese word segmentation, a foundational task in the machine processing of Chinese, has been an active research topic in recent years. Its output strongly affects downstream processing tasks, which makes it well worth studying. A comprehensive analysis of the word segmentation literature from the past five years indicates that future work will be dominated by fusion methods built on neural network models, in pursuit of more accurate and more efficient segmentation. The development and wider adoption of segmentation technology are also held back by several performance bottlenecks. Beyond the traditional problems of ambiguity and out-of-vocabulary (unknown) words, segmentation now faces newer challenges such as dependence on corpus scale and quality and multi-domain segmentation; breakthroughs on these problems will be one focus of future research.
Authors 钟昕妤 (ZHONG Xin-yu); 李燕 (LI Yan) (School of Information Engineering, Gansu University of Traditional Chinese Medicine, Lanzhou 730101, China)
Source Software Guide (《软件导刊》), 2023, No. 2, pp. 225-230 (6 pages)
Funding Gansu University of Traditional Chinese Medicine Graduate Innovation Fund Project (2022CX137).
Keywords Chinese word segmentation; deep learning; corpus dependence; multi-domain word segmentation