Extraction of index terms for Chinese books (面向中文书籍的书后索引项提取)
Abstract: To improve the accuracy and efficiency of index compilation, and to address the problem that keyword-based extraction algorithms fail to extract index terms that are both related to a book's subject and valuable as index entries, a comprehensive evaluation method is proposed for extracting back-of-the-book index terms. The domain importance of each candidate index term is computed from its category and reference relationships in a knowledge base, drawing on the PageRank algorithm. The internal information of the book is analyzed comprehensively, and the in-book importance of each candidate is computed from statistical, positional, and other features. A comprehensive evaluation model is then built to assess how suitable each candidate is as a back-of-the-book index term. Experimental results show that the proposed method significantly improves accuracy, recall, and F-measure over the unimproved algorithm.
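The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the feature set (term frequency plus a heading bonus), the weight `alpha`, the linear combination, and all function names here are assumptions chosen for illustration. The PageRank routine is the standard power-iteration form applied to a hypothetical graph of reference links between candidate terms.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Standard power-iteration PageRank over {node: [nodes it links to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


def internal_importance(freq, in_heading, max_freq, heading_bonus=0.5):
    """Assumed in-book score: relative frequency plus a bonus for headings."""
    score = freq / max_freq + (heading_bonus if in_heading else 0.0)
    return score / (1.0 + heading_bonus)  # scale into [0, 1]


def rank_candidates(links, stats, alpha=0.5):
    """Combine domain and in-book importance with an assumed linear weight.

    stats maps each candidate term to (frequency, appears_in_heading).
    """
    domain = pagerank(links)
    max_domain = max(domain.values())
    max_freq = max(freq for freq, _ in stats.values())
    scores = {}
    for term, (freq, in_heading) in stats.items():
        d = domain.get(term, 0.0) / max_domain
        i = internal_importance(freq, in_heading, max_freq)
        scores[term] = alpha * d + (1.0 - alpha) * i
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical reference links between candidate terms in a knowledge base.
links = {
    "index": ["PageRank"],
    "keyword": ["PageRank", "index"],
    "PageRank": ["index"],
}
# Hypothetical in-book statistics: (frequency, appears in a heading).
stats = {"index": (10, True), "keyword": (2, False), "PageRank": (5, False)}

for term, score in rank_candidates(links, stats):
    print(f"{term}: {score:.3f}")
```

Candidates that score above a chosen threshold would then be kept as index terms. How the paper actually weights the two scores and which in-book features it uses would have to be taken from the full text.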
Authors: TIAN Meng (田梦); LI Ning (李宁); LYU Shu-qi (吕淑琪); TIAN Ying-ai (田英爱); XU Jie (许洁). Computer School, Beijing Information Science and Technology University, Beijing 100101, China; China Electronics Standardization Institute, Beijing 100007, China
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core journal, 2019, No. 1, pp. 261-267 (7 pages)
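Funding: National Natural Science Foundation of China (61672105); National High Technology Research and Development Program of China (863 Program) (2015AA015403); "He-Gao-Ji" (Core Electronic Devices, High-end Generic Chips and Basic Software) National Science and Technology Major Project (2012ZX01045-006)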
Keywords: back-of-the-book index; candidate index term extraction; back-of-the-book index term extraction; PageRank; feature evaluation