Abstract
Named entity recognition (NER) is a core task in natural language processing, with wide application in information extraction, question answering, and machine translation. This survey first describes and summarizes methods based on rules and dictionaries and methods based on statistical machine learning. It then reviews deep learning NER models built on supervised learning, distant supervision, and the Transformer, with particular attention to the Transformer architecture and its related models that have recently become popular in natural language processing, including Transformer-based masked language modeling and autoregressive language modeling as represented by BERT, T5, and GPT. Next, it briefly discusses data-based and model-based transfer learning methods applied to NER. Finally, it summarizes the challenges facing the NER task and future development trends.
Authors
丁建平
李卫军
刘雪洋
陈旭
DING Jian-ping; LI Wei-jun; LIU Xue-yang; CHEN Xu (School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journal
2024, No. 7, pp. 1296-1310 (15 pages in total)
Computer Engineering & Science
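Funding
National Natural Science Foundation of China (62066038, 61962001)
Fundamental Research Funds for the Central Universities (2019KYQD04, 2021JCYJ12, 2022PT_S04)
Natural Science Foundation of Ningxia (2021AAC03215)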
Keywords
named entity recognition
machine learning
deep learning
transfer learning
natural language processing