
Satellite domain corpus construction and named entity recognition
Abstract To address the scarcity of named entity corpora in the satellite domain and the low recognition performance of existing algorithms, a satellite-domain entity annotation method that accounts for fuzzy boundaries is proposed, and a corpus covering 8 common types of satellite-domain entities is constructed. Compared with existing corpora in this field, the new corpus has finer granularity and wider coverage. On this basis, a satellite-domain entity recognition algorithm combining transfer learning and multi-network fusion is proposed. The algorithm uses pretrained bidirectional encoder representations from transformers to smoothly transfer semantics to the corpus and obtain subword-level features, a bi-directional long short-term memory (BiLSTM) network to capture contextual information and determine entity boundaries, and a conditional random field as the decoder for label prediction. Experimental results show that the proposed algorithm outperforms traditional models such as BiLSTM: its F1-score exceeds 92% on each of the 8 entity types, and its micro-averaged F1-score reaches 96.10%.
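The decoding step named in the abstract, label prediction with a conditional random field, can be illustrated with a minimal Viterbi sketch. This is not the authors' implementation: the tag set (`O`, `B-SAT`, `I-SAT`), the emission scores (standing in for BiLSTM outputs), and the transition scores are all hypothetical values chosen for illustration. The sketch only shows how transition scores let a CRF decoder reject an invalid tag sequence that per-token argmax would produce.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: Sequence[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain CRF."""
    if not emissions:
        return []
    # best[t][y]: best score of any path that ends with label y at position t
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t - 1][y]: previous label on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set with a single illustrative entity type "SAT"
labels = ["O", "B-SAT", "I-SAT"]

# Hypothetical transition scores: "O" -> "I-SAT" is strongly penalized,
# because an entity cannot continue without having begun
transitions = {
    "O": {"O": 0.0, "B-SAT": 0.0, "I-SAT": -100.0},
    "B-SAT": {"O": 0.0, "B-SAT": 0.0, "I-SAT": 0.0},
    "I-SAT": {"O": 0.0, "B-SAT": 0.0, "I-SAT": 0.0},
}

# Hypothetical per-token emission scores for a three-token sentence
emissions = [
    {"O": 2.0, "B-SAT": 0.1, "I-SAT": 0.0},
    {"O": 0.0, "B-SAT": 1.0, "I-SAT": 1.5},
    {"O": 0.1, "B-SAT": 0.2, "I-SAT": 2.0},
]

# Per-token argmax ignores transitions and yields an invalid O -> I-SAT step
greedy = [max(labels, key=lambda y: e[y]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, labels)

print(greedy)   # ['O', 'I-SAT', 'I-SAT']
print(decoded)  # ['O', 'B-SAT', 'I-SAT']
```

In a trained model the transition scores would be learned jointly with the encoder rather than set by hand as they are here.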
Authors XU Cong; SHI Huipeng; CHEN Zhimin; ZHANG Xinyu; WANG Jing; YANG Jiasen (Key Laboratory of Electronics and Information Technology for Space Systems, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; The State Radio_monitoring_center Testing Center, Beijing 100041, China)
Source Journal of National University of Defense Technology (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2024, No. 4, pp. 175-183 (9 pages)
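Funding Supported by the Merit-Based Fund of the Key Laboratory of Electronics and Information Technology for Space Systems, Chinese Academy of Sciences (Y42613A32S).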
Keywords named entity recognition; transfer learning; neural networks; data scarcity