Abstract
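Extracting structured enterprise relations from large volumes of unstructured enterprise text is the foundation for building an enterprise knowledge graph. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are currently the main approaches to relation extraction. Because enterprise text has complex grammatical features and pronounced long-range dependencies, a Bi-GRU, a variant of the RNN, is used for preliminary feature extraction. Although the Bi-GRU accounts for correlations between distant words, the features it extracts are not sufficient on their own. Self-Attention is therefore introduced on top of it, so that the model can further compute the long-range dependency features of each word and improve its feature representation ability. Finally, experimental comparisons across several models show that the method further improves relation extraction performance on enterprise text compared with a Bi-GRU-only model and other classic models.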
Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are the main methods of relation extraction. However, because enterprise text has complex grammatical features and pronounced long-range dependencies, a Bi-GRU, a variant of the RNN, is adopted. Although the Bi-GRU takes correlations between distant words into account, its feature extraction is insufficient. Self-attention is therefore introduced on top of it, enabling the model to further compute the long-range dependency features of each word and improving its feature representation ability.
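The pipeline described in the abstract (Bi-GRU encoding followed by self-attention) can be illustrated with a minimal NumPy sketch. The abstract does not specify the attention formulation, the pooling step, or the classifier, so everything beyond "Bi-GRU then self-attention" is an assumption made for illustration: scaled dot-product attention, max-pooling over positions, a linear softmax output layer, bias-free gates, random untrained weights, and all names and dimensions (`make_params`, `d_att`, `n_rel`, and so on). It is not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def init_gru(rng, d_in, d_h):
    # One bias-free weight matrix per gate, acting on [x_t; h_{t-1}] (simplification).
    shape = (d_in + d_h, d_h)
    return {k: rng.normal(0.0, 0.1, shape) for k in ("z", "r", "h")}

def gru_pass(X, p, d_h):
    # Run a single-direction GRU over the rows of X and return all hidden states.
    h = np.zeros(d_h)
    states = []
    for x in X:
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ p["z"])                                 # update gate
        r = sigmoid(xh @ p["r"])                                 # reset gate
        h_tilde = np.tanh(np.concatenate([x, r * h]) @ p["h"])   # candidate state
        h = (1.0 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)

def bi_gru(X, p_fwd, p_bwd, d_h):
    # Concatenate left-to-right and right-to-left states for every position.
    fwd = gru_pass(X, p_fwd, d_h)
    bwd = gru_pass(X[::-1], p_bwd, d_h)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

def self_attention(H, Wq, Wk, Wv):
    # Assumed form: scaled dot-product attention of the sequence over itself.
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)
    return A @ V, A

def make_params(seed=0, d_in=8, d_h=6, d_att=5, n_rel=4):
    # Hypothetical sizes; weights are random, not trained.
    rng = np.random.default_rng(seed)
    return {
        "d_h": d_h,
        "fwd": init_gru(rng, d_in, d_h),
        "bwd": init_gru(rng, d_in, d_h),
        "Wq": rng.normal(0.0, 0.1, (2 * d_h, d_att)),
        "Wk": rng.normal(0.0, 0.1, (2 * d_h, d_att)),
        "Wv": rng.normal(0.0, 0.1, (2 * d_h, d_att)),
        "Wo": rng.normal(0.0, 0.1, (d_att, n_rel)),
    }

def classify(X, params):
    H = bi_gru(X, params["fwd"], params["bwd"], params["d_h"])
    C, A = self_attention(H, params["Wq"], params["Wk"], params["Wv"])
    sentence = C.max(axis=0)               # assumed max-pooling over positions
    probs = softmax(sentence @ params["Wo"])
    return probs, A

if __name__ == "__main__":
    params = make_params()
    X = np.random.default_rng(1).normal(size=(7, 8))  # 7 tokens, 8-dim embeddings
    probs, A = classify(X, params)
    print(probs.shape, A.shape)
```

Each row of the attention matrix `A` weights every other position in the sentence directly, which is how the attention layer can pick up long-range dependencies that the recurrent states alone may represent only weakly.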
Source
《工业控制计算机》
2020, No. 4, pp. 108-110, 113 (4 pages in total)
Industrial Control Computer