TFLD: An Automatic Keyword Extraction Method for Chinese Text (cited by: 4)

Original English title: TFLD: a novel phrase-extraction method for Chinese text
Abstract: To improve the accuracy and practicality of keyword extraction for Chinese text, this paper proposes TFLD (term frequency, location & distance algorithm), a keyword extraction algorithm with an improved calculation of candidate-word weights. The algorithm learns from the weight-sorted sequence of candidate words, which improves the efficiency of keyword extraction. The method uses three features of each candidate word: its term frequency, the region of the text in which it appears, and its distance rank. The adjustment factors of the model are trained with the least mean square (LMS) rule. Experimental results show that the method improves the precision of keyword extraction.
Authors: 管瑞霞 (Guan Ruixia), 陆蓓 (Lu Bei)
Source: 《机电工程》 (Journal of Mechanical & Electrical Engineering), CAS, 2010, No. 9, pp. 123-126 (4 pages)
Keywords: keyword extraction; Chinese text; Chinese information processing
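
The abstract names three features and an LMS-trained set of adjustment factors but does not give the weighting formula. The following is a minimal sketch of how such a scheme could look, assuming a candidate's weight is a plain linear combination of its three feature values and that the factors are fitted with the standard Widrow-Hoff (LMS) update. The feature scaling to [0, 1], the 0/1 training labels, and the names `score`, `lms_train`, and `rank_candidates` are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed encoding: (term frequency, location, distance), each scaled to [0, 1].
Features = Tuple[float, float, float]


def score(weights: Sequence[float], features: Features) -> float:
    """Candidate weight as a linear combination of the three features."""
    return sum(w * x for w, x in zip(weights, features))


def lms_train(
    samples: Sequence[Features],
    labels: Sequence[float],
    lr: float = 0.1,
    epochs: int = 200,
) -> List[float]:
    """Fit the adjustment factors with the Widrow-Hoff (LMS) rule.

    For each sample: w <- w + lr * (label - w.x) * x
    Labels are assumed to be 1.0 for true keywords and 0.0 otherwise.
    """
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - score(weights, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights


def rank_candidates(
    weights: Sequence[float], candidates: Dict[str, Features]
) -> List[str]:
    """Return candidate words sorted by descending weighted score."""
    return sorted(
        candidates, key=lambda word: score(weights, candidates[word]), reverse=True
    )


if __name__ == "__main__":
    # Toy training data (hypothetical values, not from the paper).
    train_x = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.6), (0.2, 0.1, 0.3), (0.1, 0.2, 0.2)]
    train_y = [1.0, 1.0, 0.0, 0.0]
    factors = lms_train(train_x, train_y)
    print(rank_candidates(factors, {"a": (0.9, 0.8, 0.7), "b": (0.1, 0.2, 0.2)}))
```

Under these assumptions, the top-ranked candidates would be taken as the extracted keywords; how the paper actually computes the location and distance features is not stated in this record.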