A Chinese Keyword Extraction Algorithm Based on the TFIDF Method (cited by: 65)
Abstract: Building on Massive Intelligent Segmentation (海量智能分词), this paper proposes a Chinese keyword extraction algorithm based on the vector space model and the TFIDF method. After automatically segmenting the text into words, the algorithm uses TFIDF to compute a weight for every word in the document space, then extracts the keywords of scientific and technical documents according to those weights. Experiments with software written by the authors indicate that the algorithm is markedly effective at automatically extracting keywords from Chinese scientific and technical documents.
Source: Information Studies: Theory & Application (《情报理论与实践》), CSSCI, PKU Core Journal, 2008, No. 2, pp. 298-302 (5 pages)
Keywords: keyword extraction; vector space model (VSM); algorithm
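The pipeline described in the abstract (segment the text, weight each word with TFIDF, keep the highest-weighted words) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the exact TFIDF variant, so the sketch assumes the common form tf × log(N / df), and it assumes the documents have already been segmented into word lists. The function name `tfidf_keywords` and the parameter `top_k` are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the words of docs[doc_index] by TF-IDF weight.

    docs: list of documents, each already segmented into a list of words
    (segmentation itself is outside this sketch).
    Assumed weighting: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)

    # Document frequency: the number of documents that contain each word.
    df = Counter()
    for words in docs:
        df.update(set(words))

    words = docs[doc_index]
    tf = Counter(words)
    weights = {
        word: (count / len(words)) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }

    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy, hand-segmented corpus used only for illustration.
    docs = [
        ["keyword", "extraction", "algorithm", "keyword"],
        ["segmentation", "algorithm"],
        ["vector", "space", "model", "algorithm"],
    ]
    for word, weight in tfidf_keywords(docs, 0, top_k=2):
        print(word, round(weight, 3))
```

In this toy corpus, a word that appears in every document (here "algorithm") gets weight 0 under the assumed formula, which is why TFIDF tends to favor words that are frequent in one document but rare across the collection.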
