
General Evaluation Model for Automatic Indexing (自动标引通用评价模型研究)

Cited by: 6
Abstract: Most documents currently have no keywords assigned, and manual keyword assignment is expensive, time-consuming, subjective, and error-prone. Automatic keyword indexing is therefore worth studying, and evaluating its results effectively has become a pressing problem. Evaluating the performance of a keyword indexing system is not easy, however. Conventional evaluation methods rely on exact matching between the indexing output and the test data, so their results do not fully reflect actual indexing quality, and they are also costly. This paper proposes a general evaluation model for automatic indexing that makes use of external knowledge resources and evaluates indexing results in two settings: with a reference and without a reference. Experimental results show that the model improves the reliability of indexing evaluation and reduces its cost.
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), CSSCI, Peking University Core Journal, 2009, No. 1, pp. 40-47 (8 pages)
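Funding: This research was supported by a Key Project of the National Science and Technology Support Program of the 11th Five-Year Plan (2006BAH03B02), the Nanjing University of Science and Technology Young Researcher Support Fund (JGQN0701), the Nanjing University of Science and Technology Research Start-up Fund (AB41123), and the 2006 Jiangsu Province Graduate Training Innovation Project.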
Keywords: automatic indexing, evaluation model, semantic similarity, similarity computation
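
The abstract's point about exact matching can be illustrated with a minimal sketch. It is not the paper's model: the record gives no details of the model, its external resources, or its similarity measure. The function names (`exact_match_prf`, `soft_match_prf`, `string_similarity`) and the sample keyword lists are hypothetical, and the standard-library `difflib.SequenceMatcher` is used only as a stand-in for a semantic similarity measure, since it compares character strings rather than meanings.

```python
from difflib import SequenceMatcher


def exact_match_prf(predicted, reference):
    """Precision, recall, and F1 under strict string equality."""
    pred, ref = set(predicted), set(reference)
    hits = len(pred & ref)
    p = hits / len(pred) if pred else 0.0
    r = hits / len(ref) if ref else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def string_similarity(a, b):
    """Stand-in similarity in [0, 1] (assumption: character-level, not semantic)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def soft_match_prf(predicted, reference, sim=string_similarity):
    """Soft precision/recall: each term is credited with its best
    similarity to any term in the other list."""
    pred = list(dict.fromkeys(predicted))  # de-duplicate, keep order
    ref = list(dict.fromkeys(reference))
    if not pred or not ref:
        return 0.0, 0.0, 0.0
    p = sum(max(sim(t, g) for g in ref) for t in pred) / len(pred)
    r = sum(max(sim(g, t) for t in pred) for g in ref) / len(ref)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical system output and manually assigned reference keywords.
predicted = ["keyword extraction", "evaluation models", "similarity"]
reference = ["keyphrase extraction", "evaluation model", "semantic similarity"]

exact = exact_match_prf(predicted, reference)
soft = soft_match_prf(predicted, reference)
print("exact P/R/F1:", exact)
print("soft  P/R/F1:", soft)
```

On these sample lists no predicted term is string-identical to a reference term, so the exact-match scores are all zero even though every prediction is close to a reference keyword. The soft variant gives partial credit, which is the kind of gap a similarity-based evaluation is meant to close.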

References (15)

  • 1. Zeng Yuanxian (曾元显). Automatic keyword extraction techniques and related-term feedback [J]. Bulletin of the Library Association of China (中国图书馆学会会报), 1997, (59): 59-64.
  • 2. Li Sujian, Wang Houfeng, Yu Shiwen, Xin Chengsheng. Research on applying the maximum entropy model to automatic keyword indexing [J]. Chinese Journal of Computers (计算机学报), 2004, 27(9): 1192-1197. Cited by: 92
  • 3. Chien L F. PAT-tree-based keyword extraction for Chinese information retrieval [C] // Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Philadelphia, PA, USA, 1997: 50-59.
  • 4. Turney P D. Learning algorithms for keyphrase extraction [J]. Information Retrieval, 2000, 2(4): 303-336.
  • 5. Moens M F. Automatic Indexing and Abstracting of Document Texts [M]. Boston/Dordrecht/London: Kluwer Academic Publishers, 2000: 78, 104.
  • 6. Zhang K, Xu H, Tang J, et al. Keyword extraction using support vector machine [C] // Proceedings of the 6th International Conference on Advances in Web-Age Information Management Conference. Hong Kong, China, 2006: 85-96.
  • 7. Turney P D. Mining the Web for Lexical Knowledge to Improve Keyphrase Extraction: Learning from Labeled and Unlabeled Data. Technical Report ERB-1096 [R]. National Research Council Canada, 2002: 1-34.
  • 8. Deerwester S, Dumais S T, Landauer T K, et al. Indexing by latent semantic analysis [J]. Journal of the American Society for Information Science, 1990, 41(6): 391-407.
  • 9. Salton G, Wong A, Yang C S. A vector space model for automatic indexing [J]. Communications of the ACM, 1975, 18(11): 613-620.
  • 10. Hou Hanqing, Zhang Chengzhi, Zheng Hong. A preliminary study of indexing-source weighting schemes in Web concept mining [J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2005, 24(1): 87-92. Cited by: 32

Secondary references: 7

Co-citing documents: 160

Co-cited documents: 51

Citing documents: 6

Secondary citing documents: 61
