Abstract
Information automation is an inevitable trend in the judicial domain. Judicial named entity recognition is the foundation of that automation and a necessary prerequisite for subsequent judicial event extraction and the construction of judicial knowledge graphs, so it is of considerable research significance. Natural language processing has advanced steadily and entity recognition research has matured, but because of the particular characteristics of Chinese characters and the very high accuracy demanded in the judicial domain, few studies address named entity recognition for judicial text. To this end, we propose a deep learning model that automatically recognizes entities in judgment documents. The model combines a bidirectional long short-term memory network (BiLSTM) with a conditional random field (CRF) module and is referred to as BiLSTM-CRF. To further improve recognition accuracy, the model is trained with the Adam optimizer. The model is validated on a dataset of judgment documents for commutation, parole, and temporary service outside prison cases obtained from the judgment documents website (裁判文书网). In the comparison experiments, the model is first compared with other entity recognition models and is then trained with different optimization algorithms to demonstrate the effectiveness of Adam. The experiments show that BiLSTM-CRF with the Adam optimizer achieves the best results on the dataset, with an accuracy of 0.876, a recall of 0.858, and an F1 score of 0.855, which demonstrates the feasibility of the approach for entity recognition in the judicial domain.
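As a rough illustration of the CRF decoding step in a BiLSTM-CRF tagger, the sketch below runs Viterbi decoding over per-character tag scores. It is not the authors' implementation: the function name `viterbi_decode`, the three-tag B/I/O scheme, and every score are assumptions chosen for illustration. In a trained model the emission scores would come from the BiLSTM and the transition scores would be learned.

```python
from typing import Dict, List

# Hypothetical tag set and scores, invented for this example only.
TAGS = ["B", "I", "O"]

# TRANSITIONS[prev][cur]: score of moving from tag `prev` to tag `cur`.
# The large penalty on O -> I discourages an entity that starts with "I".
TRANSITIONS = {
    "B": {"B": -1.0, "I": 1.0, "O": 0.0},
    "I": {"B": 0.0, "I": 0.5, "O": 0.0},
    "O": {"B": 0.0, "I": -10.0, "O": 0.5},
}

# EMISSIONS[t][tag]: per-character tag score (stand-in for BiLSTM output).
EMISSIONS = [
    {"B": 1.0, "I": 0.0, "O": 1.2},
    {"B": 0.0, "I": 2.0, "O": 0.5},
    {"B": 0.0, "I": 0.3, "O": 1.0},
]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the tag sequence with the highest total score."""
    # best[tag] = score of the best path that ends in `tag` at the current step
    best = {tag: emissions[0][tag] for tag in tags}
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximises path score + transition score.
            prev = max(tags, key=lambda p: best[p] + transitions[p][cur])
            new_best[cur] = best[prev] + transitions[prev][cur] + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Start from the best final tag and follow the back-pointers to the start.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))  # → ['B', 'I', 'O']
```

With these illustrative scores, choosing the best tag for each character independently would give O, I, O, which opens an entity with I. The transition penalty on O to I leads the decoder to B, I, O instead, which is the kind of sequence-level constraint a CRF layer contributes on top of BiLSTM scores.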
Authors
杨品莉
谢志长
YANG Pin-li; XIE Zhi-chang (College of Electronic Information, Sichuan University, Chengdu 610065)
Source
《现代计算机》
2020, Issue 25, pp. 3-8 (6 pages)
Modern Computer
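Funding
National Key Research and Development Program of China: Collaborative Demonstration and Comprehensive Evaluation of Typical Smart Justice Applications (No. 2018YFC0832300).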
Keywords
Bidirectional Long Short-Term Memory (BiLSTM)
Conditional Random Field (CRF)
Named Entity Recognition
Judicial Named Entity Recognition
Optimizer