Abstract
Patent data has grown explosively in recent years, and accurately extracting topic information from patent text and visualizing it has become an important research direction. Studying the evolution of patent topics can reveal latent development patterns in patents and provides a valuable reference for related research. This paper applies the hierarchical Dirichlet process (HDP) to patent topic clustering, mines the splitting and merging of topics by comparing each year's topics with the topics obtained after historical data is added, and finally visualizes the topic information with a stacked graph. Experiments on real vehicle patent data show that vehicle patents fall into three major topics, and that the topics take various forms: some split and some merge, some grow year by year and some shrink, and new topics emerge while others die out. The experiments also show that, starting in 2006, vehicle safety and new energy sources for vehicles each became an independent topic and grew year by year.
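The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes that each topic is already available as a word-weight dictionary (for example, from an HDP run that is not shown here), and it links an earlier topic to a later one when their cosine similarity reaches a hypothetical threshold. The function names, the threshold, and the toy topics are all invented for illustration.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = sqrt(sum(w * w for w in p.values()))
    norm_q = sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def classify_transitions(earlier, later, threshold=0.5):
    """Link similar topics across two periods and label the transitions.

    `threshold` is an assumed cut-off, not a value taken from the paper.
    """
    links = [
        (a, b)
        for a, p in earlier.items()
        for b, q in later.items()
        if cosine(p, q) >= threshold
    ]
    successors = {a: [b for x, b in links if x == a] for a in earlier}
    predecessors = {b: [a for a, y in links if y == b] for b in later}
    return {
        # one earlier topic feeding several later topics
        "split": {a: bs for a, bs in successors.items() if len(bs) > 1},
        # several earlier topics feeding one later topic
        "merge": {b: ps for b, ps in predecessors.items() if len(ps) > 1},
        # earlier topics with no successor
        "perished": [a for a, bs in successors.items() if not bs],
        # later topics with no predecessor
        "newborn": [b for b, ps in predecessors.items() if not ps],
    }


# Toy topics, invented for illustration only.
example_earlier = {
    "powertrain": {"engine": 0.5, "battery": 0.5},
    "body": {"door": 1.0},
}
example_later = {
    "engine": {"engine": 1.0},
    "new_energy": {"battery": 1.0},
    "safety": {"airbag": 1.0},
}
result = classify_transitions(example_earlier, example_later)
```

On these toy inputs, "powertrain" links to both "engine" and "new_energy" and is reported as a split, "body" has no successor, and "safety" has no predecessor. On real data the threshold would need tuning, since it determines how many links are formed.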
Source
《情报学报》
CSSCI
Peking University Core Journals
2014, No. 9, pp. 944-951 (8 pages)
Journal of the China Society for Scientific and Technical Information
Funding
National Natural Science Foundation of China (Grant No. 61277370)
Natural Science Foundation of Liaoning Province (Grant No. 201202031)
Keywords
HDP
hierarchical Dirichlet processes
topic clustering
topic evolution
vehicle patent