
An intellectual property entity recognition method based on Transformer and technical word information (cited by: 1)
Abstract: Patent texts contain a large amount of entity information. Named entity recognition can extract the intellectual property (IP) entities that carry key information from these texts, helping researchers grasp patent content more quickly. Existing named entity extraction methods struggle to make full use of the word-level semantic information introduced by variation in technical vocabulary. This paper proposes an IP entity extraction method based on Transformer and technical word information. The method uses the BERT language model to provide accurate character vector representations and, during character vector generation, adds technical word information that an iterated dilated convolutional network extracts from the character vectors, which improves the representation of IP entities. Finally, a Transformer encoder with relative position encoding learns the deep semantic information of the text from the character vector sequence and predicts the entity labels. Experimental results on public and annotated patent datasets show that the method improves entity recognition accuracy.
Authors: WANG Yuhui; DU Junping; SHAO Yingxia (School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China; Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 1, pp. 186-193 (8 pages)
Funding: National Key Research and Development Program of China (2018YFB1402600); National Natural Science Foundation of China (61772083).
Keywords: Chinese named entity recognition; intellectual property; Transformer encoder; information fusion; vector representation; science and technology big data; patent; deep learning
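
The abstract names an iterated dilated convolutional network as the component that extracts technical word information from character vectors. The following is a minimal sketch of that general technique only, not the authors' implementation: the abstract gives no kernel width, dilation rates, or iteration count, so the values used here (width 3, dilations 1, 2, 4) are assumptions, and the function names are hypothetical. It works on scalar sequences with a fixed kernel, whereas a real model would use learned weights over vector-valued character embeddings.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Same-length 1D dilated convolution with zero padding (odd kernel width)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # input position read by this kernel tap
            if 0 <= j < len(x):            # positions outside the sequence count as zero
                acc += w * x[j]
        out.append(acc)
    return out


def receptive_field(kernel_width: int, dilations: Sequence[int]) -> int:
    """Number of input positions that can influence a single output position."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


def iterated_dilated_block(
    x: Sequence[float],
    kernel: Sequence[float],
    dilations: Sequence[int],
    iterations: int,
) -> List[float]:
    """Apply the same stack of dilated layers `iterations` times, reusing the kernel."""
    h = list(x)
    for _ in range(iterations):
        for d in dilations:
            h = [max(0.0, v) for v in dilated_conv1d(h, kernel, d)]  # ReLU after each layer
    return h


# Assumed settings for illustration only: a unit impulse in a 21-position sequence.
impulse = [0.0] * 21
impulse[10] = 1.0
spread = iterated_dilated_block(impulse, [1.0, 1.0, 1.0], [1, 2, 4], iterations=1)
```

With these assumed settings, three layers of width 3 reach 15 input positions, which is why dilation lets a short stack cover a multi-character technical term without pooling. Repeating the block with shared weights widens the context further without adding parameters.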
