Abstract
This paper presents an exhaustive literature review of patent mining. Searches of CNKI, Wanfang Data and Web of Science returned 72 relevant Chinese-language papers and 98 English-language papers, of which 66 were selected for review. Patent mining research covers six topics: term extraction, clustering, categorization, methods based on complex networks, methods based on time, and technical studies based on patent mining. Although the field has developed rapidly over the past 10 years, some studies have shortcomings that call for improvement: experimental validation is imprecise; automatic categorization based on the IPC performs poorly; the research question is unclear and the work is limited to applying a given method; and the granularity is coarse. The paper concludes that patent mining research should focus on identifying new problems rather than merely applying existing methods to patents.
Source
《图书情报工作》
CSSCI
Peking University Core Journals (北大核心)
2014, No. 20, pp. 131-137 (7 pages)
Library and Information Service
Funding
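China Postdoctoral Science Foundation Special Funding Project "Research on Patent Text Mining for Information Analysis" (Grant No. 2013T60151)
One of the research outputs of the International Cooperation Project "Cooperative Research on Practical Japanese-Chinese Bidirectional Machine Translation for Scientific and Technical Literature" (Grant No. 2014DFA11350)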
Keywords
patent
data mining
text mining
term extraction
clustering
categorization
complex network analysis
time series analysis
survival analysis