Abstract
Word2vec-based patent semantic retrieval methods cannot handle polysemous words, which lowers recall, and they require a large number of inner product computations, which makes retrieval inefficient. To address these problems, this paper designs an intelligent patent semantic retrieval system based on BERT (Bidirectional Encoder Representations from Transformers) and Milvus. The system converts patent titles and abstracts into word vectors with a pre-trained BERT model and imports them into the Milvus vector retrieval engine to provide semantic retrieval. The system is built on the Django framework, with a Vue.js front end for the web pages and a MySQL database. It comprises five modules: login management, system, personal center, user data management, and retrieval. Tested on a manually labeled patent dataset, the system reached a retrieval recall of 86% and an average accuracy of 80%. The experiments show that BERT effectively improves retrieval accuracy, and that combining BERT with Milvus allows an intelligent semantic retrieval system to be built quickly.
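The retrieval and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny hand-written vectors stand in for BERT output, a brute-force inner product search stands in for the Milvus engine, and the document ids, query, and relevance labels are all hypothetical. The abstract does not define its metrics, so the sketch assumes the standard set-based definitions of recall and precision.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def top_k(query, corpus, k):
    """Exhaustive search: score every document by inner product, return the k best ids."""
    q = normalize(query)
    scores = {
        doc_id: sum(a * b for a, b in zip(q, normalize(vec)))
        for doc_id, vec in corpus.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

def recall_and_precision(retrieved, relevant):
    """Recall = hits / |relevant|; precision = hits / |retrieved|."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(relevant), hits / len(retrieved)

# Hypothetical 3-dimensional embeddings; a real encoder yields far more dimensions.
corpus = {
    "patent_a": [0.9, 0.1, 0.0],
    "patent_b": [0.8, 0.2, 0.1],
    "patent_c": [0.0, 0.1, 0.9],
    "patent_d": [0.1, 0.9, 0.1],
}
query = [1.0, 0.1, 0.0]
relevant = {"patent_a", "patent_b", "patent_d"}  # made-up ground-truth labels

retrieved = top_k(query, corpus, k=2)
recall, precision = recall_and_precision(retrieved, relevant)
```

The exhaustive loop in `top_k` scores every stored vector for each query, which is the cost the abstract attributes to inner product computation; a vector engine such as Milvus replaces that loop with an index.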
Author
许林
XU Lin (School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu 611137; School of Control Engineering, Chengdu University of Information Technology, Chengdu 610025)
Source
《中国发明与专利》
2023, No. 2, pp. 5-11 (7 pages)
China Invention & Patent
Funding
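Supported by the Key R&D Project of the Sichuan Science and Technology Program (Deep Learning-Based Patent Intelligent Retrieval and Analysis System, No. 2021YFG0308)
and the Open Research Project of the Sichuan International Joint Research Center for Robotics and Intelligent Systems (Multi-Agent Collaborative Visual SLAM, JQZN2021-005).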