
Research on Automatic Indexing Method for Scientific Literatures Based on Multi-filtering Strategies (cited by: 1)
Abstract: This paper proposes an automatic indexing method for scientific literature based on multi-filtering strategies. The method does not rely on a large-scale training corpus and can easily be embedded as a processing module in other text processing pipelines. Experimental results verify its feasibility. The paper also proposes a method for evaluating index terms against secondary literature. Although this evaluation method depends heavily on the quality of the abstracts and keywords given in the secondary literature, it is valuable when human and material resources are insufficient to build a high-quality test set. Formulating a more rational and effective evaluation scheme remains imperative.
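Source: Information Studies: Theory & Application (《情报理论与实践》), CSSCI, Peking University Core Journal, 2012, No. 12, pp. 98-100, 110 (4 pages)
Funding: Supported by the Institute of Scientific and Technical Information of China discipline construction project "Natural Language Processing" (No. XK2011-6), the Institute of Scientific and Technical Information of China key work project "Research and Application Demonstration of Key Technologies for Multilingual Information Access" (No. ZD2011-3-3), and the Institute of Scientific and Technical Information of China pre-research fund for scientific research projects (No. YY-201121).
Keywords: multi-filtering; S&T document; automatic indexing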
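The abstract names two ideas without giving their details: candidate index terms pass through several filters in sequence, and the surviving terms are scored against the keywords supplied with a secondary-literature record. The sketch below illustrates only that general shape. The three filters (stopword, length, frequency), the `STOPWORDS` list, the thresholds, and the precision/recall/F1 scoring are all assumptions chosen for illustration; the paper's actual filters and evaluation measure are not stated in this record.

```python
import re
from collections import Counter

# Hypothetical stopword list; the source does not specify its filters.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "for", "on", "in", "to"}


def candidate_terms(text):
    """Split text into lowercase word tokens used as candidate index terms."""
    return re.findall(r"[a-z][a-z\-]+", text.lower())


def multi_filter(tokens, min_len=3, min_freq=2):
    """Apply three illustrative filters in sequence (assumed, not the paper's)."""
    kept = [t for t in tokens if t not in STOPWORDS]   # filter 1: stopwords
    kept = [t for t in kept if len(t) >= min_len]      # filter 2: minimum length
    counts = Counter(kept)
    return {t for t, c in counts.items() if c >= min_freq}  # filter 3: frequency


def evaluate(predicted, gold):
    """Score predicted terms against gold keywords: (precision, recall, F1)."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    text = ("Automatic indexing of documents. "
            "Indexing is automatic for the documents in the archive.")
    terms = multi_filter(candidate_terms(text))
    # Gold keywords stand in for those listed in a secondary-literature record.
    print(sorted(terms), evaluate(terms, {"automatic", "indexing", "retrieval"}))
```

Because the gold set here comes straight from a record's keyword field, the scores are only as reliable as those keywords, which is the dependence the abstract itself points out.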

