
Relation extraction between discipline knowledge entities based on improved piecewise convolutional neural network and knowledge distillation
Abstract: Relation extraction is an important means of organizing discipline knowledge and an important step in constructing educational knowledge graphs. In current research, most pre-trained language models based on the Transformer architecture, such as BERT (Bidirectional Encoder Representations from Transformers), suffer from large parameter counts and excessive complexity, which makes them difficult to deploy on end devices and limits their application in real educational scenarios. In addition, most traditional lightweight relation extraction models do not model data through text structure and therefore tend to ignore the structural information between entities; moreover, the word embeddings they generate struggle to capture the contextual features of the text, handle polysemy poorly, and fit badly with the unstructured nature of discipline knowledge texts and their high proportion of proper nouns, all of which hinders high-quality relation extraction. To address these problems, a relation extraction method for discipline knowledge entities based on an improved Piecewise Convolutional Neural Network (PCNN) and Knowledge Distillation (KD) was proposed. Firstly, BERT was used to generate high-quality domain text word vectors that improve the input layer of the PCNN model, so as to capture contextual features effectively and resolve polysemy to a certain extent. Then, convolution and piecewise max pooling operations were used to mine the structural information between entities in depth, building the BERT-PCNN model and achieving high-quality relation extraction. Lastly, considering the demand for efficient and lightweight models in educational scenarios, the knowledge of the output layer and intermediate layer of the BERT-PCNN model was distilled to guide the PCNN model, completing the construction of the KD-PCNN model. Experimental results show that the weighted-average F1 score of the BERT-PCNN model reaches 94%, which is 1 and 2 percentage points higher than those of the R-BERT and EC_BERT models respectively; the weighted-average F1 score of the KD-PCNN model reaches 92%, on par with the EC_BERT model, while its parameter count is 3 orders of magnitude smaller than those of the BERT-PCNN and KD-RB-l models. The proposed method thus achieves a better trade-off between performance metrics and network parameter count, benefiting the automated construction of educational knowledge graphs and the development and deployment of new educational applications.
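To connect the method description to an implementation, a minimal PyTorch-style sketch of the piecewise max pooling step is given below: the convolutional feature map of a sentence is split into three segments at the two entity positions, each segment is max-pooled separately, and the pooled vectors are concatenated. The class name, tensor shapes, and the assumption that both entity positions fall strictly inside the sentence are illustrative, not the authors' code.

    import torch
    import torch.nn as nn

    class PiecewiseMaxPool(nn.Module):
        # Pools the three segments of a sentence's convolutional feature
        # map that are delimited by the two entity positions.
        def forward(self, conv_out: torch.Tensor, e1_pos: int, e2_pos: int) -> torch.Tensor:
            # conv_out: (seq_len, num_filters); assumes 0 < e1_pos < e2_pos < seq_len
            segments = (conv_out[:e1_pos],        # tokens before the first entity
                        conv_out[e1_pos:e2_pos],  # tokens between the two entities
                        conv_out[e2_pos:])        # tokens after the second entity
            pooled = [seg.max(dim=0).values for seg in segments]
            return torch.cat(pooled)              # shape: (3 * num_filters,)

The distillation of output-layer and intermediate-layer knowledge from the BERT-PCNN teacher to the PCNN student can likewise be sketched as a combined loss. The temperature and weighting coefficients below are assumed values, and a projection layer would be needed if the teacher and student feature dimensions differ.

    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, student_feat, teacher_feat,
                labels, temperature=2.0, alpha=0.5, beta=0.1):
        # Output-layer knowledge: KL divergence between softened class
        # distributions, scaled by T^2 as in standard knowledge distillation.
        soft = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2
        # Intermediate-layer knowledge: match hidden representations with MSE.
        hint = F.mse_loss(student_feat, teacher_feat)
        # Hard-label supervision on the annotated relation classes.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + beta * hint + (1.0 - alpha) * hard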
Authors: ZHAO Yubo; ZHANG Liping; YAN Sheng; HOU Min; GAO Mao (College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot, Inner Mongolia 010022, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journal list, 2024, Issue 8, pp. 2421-2429 (9 pages)
Funding: Inner Mongolia Natural Science Foundation (2023LHMS06009); 2023 Project of the 14th Five-Year Plan for Education Science Research of Inner Mongolia Autonomous Region (2023NGHZX-ZH119, NGJGH2023234); Graduate Research Innovation Fund of Inner Mongolia Normal University (CXJJS23067, CXJJS22137); Fundamental Research Funds of Inner Mongolia Normal University (2022JBXC018)
Keywords: relation extraction; Piecewise Convolutional Neural Network (PCNN); knowledge distillation; knowledge graph; discipline knowledge; neural network