Named Entity Recognition Approach of Judicial Documents Based on Transformer
Abstract: Named entity recognition (NER) is one of the key tasks in natural language processing and a foundation for downstream tasks. Research on the judicial domain is still relatively scarce, and many problems remain to be solved in the informatization and intelligent transformation of the judicial system. Compared with texts in other domains, judicial documents are highly specialized and have few corpus resources, so existing methods recognize entities in them poorly. This work therefore addresses three aspects. First, a multi-label hierarchical iterative annotation method (ML-HIA) is proposed, which automatically annotates raw judicial document text and also improves entity recognition performance on judicial documents. Second, a feature mixed Transformer (FM-Transformer) neural network model is proposed for NER on judicial documents; it makes full use of deep features of the inherent attributes of Chinese characters. Finally, the proposed annotation method and model are compared experimentally with other neural network models. The proposed annotation method annotates judicial documents fairly accurately, and the proposed model improves substantially over the baseline models on a general-domain dataset and achieves good results on judicial-domain datasets.
Authors: WANG Yingjie (王颖洁); ZHANG Chengye (张程烨); BAI Fengbo (白凤波); WANG Zumin (汪祖民) (College of Information Engineering, Dalian University, Dalian 116622, China; School of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2024, Issue S01, pp. 113-121 (9 pages)
Keywords: Natural language processing; Data annotation; Transformer model; Deep learning; Judicial informatization
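
The abstract does not describe the ML-HIA label scheme in detail, so the following is only a generic illustration of the character-level annotation that Chinese NER models are commonly trained on: a minimal sketch that converts entity spans into BIO tags. The function name `spans_to_bio`, the entity types `PER` and `LOC`, and the sample sentence are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def spans_to_bio(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end, type) entity spans (end exclusive) into per-character BIO tags."""
    tags = ["O"] * len(text)
    for start, end, etype in sorted(spans):
        # Reject spans that fall outside the text or are empty.
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, etype)}")
        # Plain BIO cannot represent overlapping entities, so reject them.
        if any(t != "O" for t in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, etype)}")
        tags[start] = f"B-{etype}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining characters of the entity
    return tags


# Hypothetical sentence and entity types, for illustration only (not from the paper).
text = "被告人张三在大连市盗窃"
spans = [(3, 5, "PER"), (6, 9, "LOC")]
tags = spans_to_bio(text, spans)
for ch, tag in zip(text, tags):
    print(ch, tag)
```

Tagging per character rather than per word avoids depending on a Chinese word segmenter, which is a common choice for Chinese NER; whether the paper does the same is not stated in the abstract.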