Review of Technology Term Recognition Studies Based on Machine Learning (cited by: 13)
Abstract [Objective] This paper reviews the current state and prospects of machine learning algorithms applied to technology term recognition. [Coverage] We searched the Web of Science Core Collection and CNKI using the queries "technology term*recognition" and "技术术语识别" (technology term recognition), extended the reading to the related algorithm literature, and selected 62 representative papers for review. [Methods] By analogy with named entity recognition research, we summarize how machine learning is applied to technology term recognition and how the two tasks differ, examine the field from four perspectives (algorithm classification, general workflow, open problems, and downstream applications), and discuss future application prospects. [Results] The algorithms fall into three groups: statistical machine learning alone, deep learning alone, and hybrid algorithms that combine the two. Hybrid algorithms are the most widely used, with BiLSTM-CRF as the representative mainstream model. Transfer learning is an important direction for future research. [Limitations] Deep learning is developing rapidly and new hybrid models keep emerging, so this review covers only the more widely used algorithms and models rather than listing every one. [Conclusions] Existing methods still leave many problems open for improvement. Future work should strengthen research on fine-grained entity recognition, feature representation methods, evaluation methods, and open-source toolkits.
Authors Hu Yamin (胡雅敏); Wu Xiaoyan (吴晓燕); Chen Fang (陈方) (Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China; Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China)
Source Data Analysis and Knowledge Discovery (《数据分析与知识发现》), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2022, Issue 2, pp. 7-17 (11 pages)
Keywords Technology Term Recognition; Machine Learning; Deep Learning