A topic-based feature weight calculation method for patent categorization
Original title: 专利分类中基于主题的特征权重计算方法
Cited by: 3
Abstract Automatic patent categorization is a large-scale text categorization problem with a multi-level hierarchy. Feature weight calculation is a key step, because it determines whether the text representation of a patent reflects its topic information. Based on an analysis of the characteristics of patents (titles and abstracts), this paper proposes a new topic-based feature weight calculation method. The method determines the weight of each feature from the correlation between the feature and the topic, which brings the text representation of a patent closer to the topic of the document. Experimental results show that the method outperforms conventional weight calculation methods in patent categorization.
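The abstract does not give the weighting formula, so the following is only a minimal sketch of what a weight driven by feature-topic correlation could look like. It assumes that the "topic" is a category label, that correlation is measured with pointwise mutual information (PMI), and that the final weight is term frequency times the largest positive PMI of the term over all labels. The function names, the toy documents, and the IPC-style labels `F02` and `H04` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def pmi_table(docs, labels):
    """PMI between each term and each label, from document-level counts."""
    n = len(docs)
    label_count = Counter(labels)
    term_count = Counter()
    joint = Counter()
    for tokens, label in zip(docs, labels):
        for term in set(tokens):          # count each term once per document
            term_count[term] += 1
            joint[(term, label)] += 1
    return {
        (term, label): math.log2(c * n / (term_count[term] * label_count[label]))
        for (term, label), c in joint.items()
    }

def topic_strength(pmi):
    """Largest positive PMI of each term over all labels (floored at 0)."""
    strength = {}
    for (term, _label), value in pmi.items():
        strength[term] = max(strength.get(term, 0.0), value)
    return strength

def topic_weights(tokens, pmi):
    """Assumed weighting: term frequency times topic strength."""
    strength = topic_strength(pmi)
    tf = Counter(tokens)
    return {t: f * strength.get(t, 0.0) for t, f in tf.items()}

# Hypothetical toy corpus: tokenized title+abstract text with category labels.
docs = [
    ["engine", "fuel", "valve"],
    ["engine", "piston", "method"],
    ["signal", "antenna", "method"],
    ["signal", "filter", "antenna"],
]
labels = ["F02", "F02", "H04", "H04"]

pmi = pmi_table(docs, labels)
weights = topic_weights(["engine", "method", "engine"], pmi)
print(weights)
```

In this toy corpus "engine" occurs only under one label and so receives a positive weight, while "method" is spread evenly across both labels and is weighted down to zero. That is the general behaviour the abstract describes, although the paper's actual formula may differ.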
Source Journal of Shenyang Institute of Aeronautical Engineering (《沈阳航空工业学院学报》), 2009, No. 1, pp. 46-48 (3 pages)
Funding National "863" Program of China (2006AA01Z148); Key Project of Science and Technology Research, Ministry of Education (207148)
Keywords patent categorization; text categorization; feature weight