Abstract
To give designers access to cross-domain patent knowledge during product innovation design, a Chinese patent text classification method that uses the functional basis as its classification standard is proposed. Because the functional basis has many categories and labeled patent text for training is scarce, the method works from two directions, reducing the number of categories and enlarging the data set: a supervised algorithm built from multiple binary classifiers and a semi-supervised algorithm based on the expectation-maximization (EM) algorithm are adopted, with fully supervised naive Bayes (NB) as the baseline. An orthogonal experimental design is used to examine how feature selection and data set selection affect classification accuracy. Classification accuracy for the first-level functional basis reached 80%, which basically meets application requirements. The study provides technical support for building a functional-basis patent knowledge base to assist product innovation design.
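The abstract names EM-based semi-supervised classification with naive Bayes but gives no implementation details. The following is only a minimal sketch of that general technique, assuming a multinomial naive Bayes model with Laplace smoothing and pre-tokenized documents. All function names (`train_nb`, `posterior`, `em_nb`, `predict`), the toy tokens, and the two example classes are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def train_nb(docs, weights, n_classes, vocab, alpha=1.0):
    """M-step: fit log priors and log word likelihoods from (soft) class weights."""
    class_mass = [alpha] * n_classes
    word_mass = [{t: alpha for t in vocab} for _ in range(n_classes)]
    for doc, w_doc in zip(docs, weights):
        counts = Counter(t for t in doc if t in vocab)
        for c in range(n_classes):
            class_mass[c] += w_doc[c]
            for term, n in counts.items():
                word_mass[c][term] += w_doc[c] * n
    total = sum(class_mass)
    log_prior = [math.log(m / total) for m in class_mass]
    log_like = []
    for c in range(n_classes):
        z = sum(word_mass[c].values())
        log_like.append({t: math.log(v / z) for t, v in word_mass[c].items()})
    return log_prior, log_like


def posterior(doc, model, vocab):
    """E-step for one document: P(class | doc) under the current model."""
    log_prior, log_like = model
    scores = [lp + sum(ll[t] for t in doc if t in vocab)
              for lp, ll in zip(log_prior, log_like)]
    top = max(scores)  # subtract the max before exponentiating for stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def em_nb(labeled, labels, unlabeled, n_classes, n_iter=10):
    """Fit on labeled docs, then alternate E- and M-steps over unlabeled docs."""
    vocab = {t for d in labeled + unlabeled for t in d}
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    model = train_nb(labeled, hard, n_classes, vocab)
    for _ in range(n_iter):
        soft = [posterior(d, model, vocab) for d in unlabeled]
        model = train_nb(labeled + unlabeled, hard + soft, n_classes, vocab)
    return model, vocab


def predict(doc, model, vocab):
    """Return the index of the most probable class."""
    p = posterior(doc, model, vocab)
    return max(range(len(p)), key=p.__getitem__)


# Hypothetical toy data: class 0 ~ "convey", class 1 ~ "separate".
labeled = [["pump", "convey", "fluid"], ["cut", "separate", "blade"]]
labels = [0, 1]
unlabeled = [["pump", "pipe", "fluid"], ["pipe", "convey", "valve"],
             ["blade", "saw", "separate"], ["saw", "cut", "split"]]

model, vocab = em_nb(labeled, labels, unlabeled, n_classes=2)
print(predict(["pipe", "valve"], model, vocab))
print(predict(["saw", "split"], model, vocab))
```

In this toy setup the tokens `pipe`, `valve`, `saw`, and `split` occur only in unlabeled documents, so a model trained on the two labeled documents alone has no evidence for them; the EM iterations let the unlabeled documents contribute soft counts for those tokens.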
Source
《四川大学学报(工程科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2016, No. 5, pp. 105-113 (9 pages)
Journal of Sichuan University (Engineering Science Edition)
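Funding
Key Program of the National Natural Science Foundation of China (51435011)
Specialized Research Fund for the Doctoral Program of Higher Education of China (20130181130011)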
Keywords
innovative design
functional basis
patent classification
naive Bayes
semi-supervised learning