Automatic Entity Recognition of Geographic Information Service Document (cited by: 1)
Abstract: Automatic entity recognition in the geographic information services (GIServices) domain suffers from a lack of corpora, nested entities, and semantic sparsity. To address these problems, this paper designs an entity annotation specification for GIServices literature and builds a corpus for the domain. On top of the traditional BiLSTM-CRF entity recognition model, it introduces the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model and a convolutional layer to form the BERT-1DCNN-BiLSTM-CRF model, which improves the accuracy of entity recognition in GIServices literature. In the word embedding layer, the BERT pre-trained model replaces the traditional static language model, which addresses the inability to express richer sentence semantics when large training corpora are unavailable. In addition, inter-character convolutional features are added after the BERT model to strengthen the representation of local sentence features and reduce the interference of semantic sparsity. Experimental results show that the GIServices entity recognition method combining the BERT and CNN models outperforms traditional deep learning methods, reaching an accuracy of 0.8268. The method achieves automatic entity recognition of GIServices literature reasonably well and demonstrates the effectiveness of BERT-based deep learning models for automatic entity recognition.
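Two stages of the stack named in the abstract can be illustrated without any deep learning library: the 1D convolution that mixes each character's vector with its neighbours, and the Viterbi decoding that a CRF layer uses to pick the best tag sequence. The sketch below is not the authors' implementation. The kernel width, the zero padding, the ReLU, the `O`/`B-ENT`/`I-ENT` tag names, and every score are placeholder assumptions, and the BERT encoder and the BiLSTM are omitted entirely.

```python
def conv1d_same(seq, kernels, bias):
    """Slide each kernel along the character axis with zero padding.

    seq:     T vectors of size D, one per character (stand-in for encoder output)
    kernels: F kernels, each an odd-width list of K vectors of size D
    bias:    F bias values
    returns: T vectors of size F after ReLU
    """
    width = len(kernels[0])
    pad = width // 2
    zero = [0.0] * len(seq[0])
    padded = [zero] * pad + list(seq) + [zero] * pad
    out = []
    for t in range(len(seq)):
        window = padded[t:t + width]
        row = []
        for kernel, b in zip(kernels, bias):
            total = b
            for k_vec, x_vec in zip(kernel, window):
                total += sum(w * x for w, x in zip(k_vec, x_vec))
            row.append(max(0.0, total))  # ReLU (assumed activation)
        out.append(row)
    return out


def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions[t][j]:   score of tag j at position t
    transitions[i][j]: score of moving from tag i to tag j
    """
    n = len(tags)
    score = list(emissions[0])
    back = []  # back[t][j] = best previous tag index for tag j at step t + 1
    for emit in emissions[1:]:
        pointers, new_score = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        back.append(pointers)
        score = new_score
    path = [max(range(n), key=lambda i: score[i])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[i] for i in path]


# Hypothetical tag set and scores, chosen only to show the CRF effect.
TAGS = ["O", "B-ENT", "I-ENT"]
TRANSITIONS = [
    [0.0, 0.0, -100.0],  # O -> I-ENT is effectively forbidden
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
EMISSIONS = [
    [1.0, 0.2, 0.0],
    [0.1, 0.3, 1.0],  # taken alone, this position prefers I-ENT
    [0.0, 0.1, 1.0],
    [1.0, 0.0, 0.2],
]

print(conv1d_same([[1.0], [2.0], [3.0]], [[[1.0], [1.0], [1.0]]], [0.0]))
# [[3.0], [6.0], [5.0]]
print(viterbi(EMISSIONS, TRANSITIONS, TAGS))
# ['O', 'B-ENT', 'I-ENT', 'O']
```

Picking the best tag independently at each position would give `O, I-ENT, I-ENT, O`, which starts an entity with an inside tag. The transition scores let the decoder prefer the consistent sequence instead, which is the usual motivation for placing a CRF on top of per-character scores.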
Authors: Du Lingzi (独凌子); Xiao Guirong (肖桂荣) (Key Lab of Spatial Data Mining and Information Sharing of Ministry of Education, Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350108, China)
Source: Natural Science Journal of Hainan University (《海南大学学报(自然科学版)》), CAS, 2021, Issue 4, pp. 331-339 (9 pages)
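Funding: Strategic Priority Research Program of the Chinese Academy of Sciences (XDA23100504); Central Government Guided Local Science and Technology Development Project (2020L3005).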
Keywords: geographic information service; BERT model; named entity recognition; inter-character feature convolution; BiLSTM-CRF model