Abstract
Using an IPC classification analysis algorithm, this paper evaluates the technical content of several important sections in the structure of full-text patent documents and observes how much each section's technical content contributes to the classification of the document, so that the source data used for automatic patent classification can be adjusted in a targeted way. This avoids the loss of efficiency caused by blindly running automatic patent classification over large volumes of data. The paper offers guidance on which data sources to choose for automatic patent classification, at what cost, and on how to formulate an algorithm strategy.
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2016, No. 1X, pp. 215-218 (4 pages)
Keywords
IPC classification
Classification table
TF-IDF
Similarity algorithm
Document structure
Technology distribution analysis
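
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a per-section evaluation along these lines might look, assuming TF-IDF weighting and cosine similarity (both named in the keywords). The section names, the toy patent text, the abbreviated IPC class descriptions, the whitespace tokenizer, and the helper names `tf_idf_vectors`, `cosine`, and `section_scores` are all hypothetical illustrations, not the paper's method. Real Chinese patent text would also need a word segmenter in place of `split()`.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF keeps terms that occur in every document non-zero.
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def section_scores(sections, ipc_classes):
    """Similarity of every patent section to every IPC class description."""
    names = list(sections) + list(ipc_classes)
    texts = list(sections.values()) + list(ipc_classes.values())
    # Assumption: simple lowercase whitespace tokenization.
    docs = [text.lower().split() for text in texts]
    vecs = dict(zip(names, tf_idf_vectors(docs)))
    return {
        s: {c: cosine(vecs[s], vecs[c]) for c in ipc_classes}
        for s in sections
    }


# Hypothetical toy data: three sections of one patent and two
# abbreviated, paraphrased IPC subclass descriptions.
sections = {
    "title": "image retrieval method",
    "abstract": "a method for image retrieval using feature vectors and an index",
    "claims": "a retrieval method comprising extracting image feature vectors and querying an index",
}
ipc_classes = {
    "G06F": "electric digital data processing information retrieval index query",
    "A01B": "soil working in agriculture ploughs harrows",
}

scores = section_scores(sections, ipc_classes)
for name, row in scores.items():
    best = max(row, key=row.get)
    print(name, best, round(row[best], 3))
```

Comparing the per-section scores for the known class of a document gives a rough indication of which sections carry the most classification signal, which is the kind of observation the abstract describes using to decide which sections are worth feeding to the classifier.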