Patent Entity Extraction for Technology Recognition: A Case Study of Brain-Inspired Intelligence
Abstract [Research purpose] Patent entity extraction is the basis of technology recognition from patent texts. At present, patent entity extraction suffers from low automation and low accuracy. This study addresses the problem in two ways: by building a high-quality patent corpus for a specific field, and by applying an advanced algorithm model to patent entity extraction. [Research method] A fine-grained information system containing 13 entity types was defined, and the titles and abstracts of 921 patents in the field of brain-inspired intelligence were manually annotated accordingly. A Bert-BiLSTM-CRF model, which integrates deep learning and machine learning, was then used to identify brain-inspired intelligence patent entities. [Research conclusion] Overall, the model achieved a precision, recall, and F1 value of 0.8, and recognition performance differed across entity types. Several comparative experiments were designed to verify the model's performance. The results showed that fine-tuning the data and increasing the training scale improve model performance, and that the model outperforms some classical models of the same period.
Authors Xing Xiaozhao; Yuan Pengbin; Chen Liang; Ren Liang; Yu Chi (Institute of Scientific and Technological Information of China, Beijing 100038)
Source Journal of Intelligence (《情报杂志》), PKU Core Journal, 2024, No. 6, pp. 126-133, 144 (9 pages)
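Funding Research output of the National Social Science Fund of China Youth Project "Research on Classification and Identification Methods for Disruptive Technologies Based on Multi-Source Knowledge Networks" (No. 21CTQ039).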
Keywords patent entity; patent text; patent mining; technology recognition; deep learning; machine learning; Bert-BiLSTM-CRF model
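
The abstract reports precision, recall, and F1 for entity recognition. As a minimal sketch of how such entity-level scores are commonly computed, the example below extracts spans from BIO-tagged sequences and compares predicted spans with gold spans by exact match. The BIO tagging scheme, the label names `TECH` and `MATERIAL`, and the function names are assumptions for illustration; they are not taken from the paper's annotation rules or evaluation code.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); hypothetical representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside:
            if label is not None:
                spans.add((start, i, label))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                # "O" or a stray "I-" tag: no span is open.
                start, label = None, None
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    gold_spans = extract_spans(gold)
    pred_spans = extract_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: the second entity has the right boundary but the wrong type.
gold_tags = ["B-TECH", "I-TECH", "O", "B-MATERIAL", "O"]
pred_tags = ["B-TECH", "I-TECH", "O", "B-TECH", "O"]
scores = precision_recall_f1(gold_tags, pred_tags)
```

Under exact matching, a span counts as correct only when both its boundaries and its type agree, which is one reason scores can differ noticeably between entity types.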