
Research on Named Entity Recognition in Legal Documents Based on BiLSTM-CRF (基于BiLSTM-CRF的司法领域实体识别研究)

Cited by: 1
Abstract: Automating judicial information processing is an inevitable trend in the judicial field. Judicial named entity recognition is the foundation of that automation and a prerequisite for subsequent judicial event extraction and for building a judicial knowledge graph, so it has significant research value. As natural language processing has advanced, named entity recognition research has matured, but few studies target the judicial domain, owing to the particular characteristics of Chinese characters and the domain's very high accuracy requirements. To address this, we propose a deep learning model that automatically recognizes entities in judgment documents. The model, called BiLSTM-CRF, consists of a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF) module; to further improve recognition accuracy, it is optimized with the Adam optimizer. The model is validated on a dataset of judgment documents for commutation, parole, and temporary execution of sentence outside prison cases, obtained from the judgment documents website (裁判文书网). The comparative experiments first compare the model with other entity recognition models, then train the model with different optimization algorithms to demonstrate the effectiveness of Adam. The experiments show that BiLSTM-CRF with the Adam optimizer achieves the best results on the dataset, with an accuracy of 0.876, a recall of 0.858, and an F1 value of 0.855, demonstrating the feasibility of the model for entity recognition in the judicial domain.
Authors: YANG Pin-li (杨品莉), XIE Zhi-chang (谢志长) (College of Electronic Information, Sichuan University, Chengdu 610065)
Source: Modern Computer (《现代计算机》), 2020, No. 25, pp. 3-8 (6 pages)
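Funding: National Key R&D Program of China: Collaborative Demonstration and Comprehensive Evaluation of Typical Smart Justice Applications (No. 2018YFC0832300).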
Keywords: bidirectional long short-term memory (BiLSTM); conditional random field (CRF); named entity recognition; judicial named entity recognition; optimizer
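
The abstract reports recall and an F1 value for the recognized entities. As a minimal sketch of how such scores are commonly computed for a sequence tagger, the Python below extracts entity spans from BIO tag sequences and compares predicted spans with gold spans. The record does not state the paper's tagging scheme or whether its scores are entity-level, so the BIO format, the tag names (`PER`, `CRIME`), and both function names are assumptions made purely for illustration.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); hypothetical representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence such as B-PER, I-PER, O."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def precision_recall_f1(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 over a list of tagged sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(idx,) + s for s in extract_spans(g)}
        pred_spans |= {(idx,) + s for s in extract_spans(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the second predicted entity is cut short, so only one of the
# two predicted spans matches a gold span exactly.
gold = [["B-PER", "I-PER", "O", "B-CRIME", "I-CRIME"]]
pred = [["B-PER", "I-PER", "O", "B-CRIME", "O"]]
print(precision_recall_f1(gold, pred))
```

Under this exact-match convention a partially recognized entity counts as both a false positive and a false negative, which is a stricter criterion than per-token scoring.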
