Abstract
Intelligent processing of contract text has become an important requirement in enterprise informatization. To address the messy, fragmented, and irregular nature of contract text, this paper proposes deep-learning-based models for contract classification and contract element extraction. Contract classification is studied along two lines, title classification and text classification: an attention-based BiLSTM model is proposed for title classification, and an improved HAN model is proposed for text classification, which effectively improves text classification accuracy. To address the difficulty of extracting contract information, a BiLSTM-CRF model is proposed to identify contract elements and accurately obtain contract element information. Experiments show that the proposed models are well suited to contract text processing and improve the performance of both classification and element extraction.
Authors
ZHANG Xiaofang (张晓芳)
OU Rui (欧睿)
RAO Panjun (饶攀军)
ZHENG Yuan (郑元)
ZHANG Lei (张雷)
CHEN Ke (陈科)
ZHOU Chenlian (周郴莲)
WANG Haochang (王浩畅)
ZHAO Tiejun (赵铁军)
Affiliations: Taiji Computer Co., Ltd., Beijing 100083, China; School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, China; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2022, No. 8, pp. 123-128 (6 pages)
Funding
Taiji Computer Co., Ltd. project (WBXM202101009).
Keywords
contract text
text classification
element extraction
deep learning