
Thematic Words Extracting from Chinese Text Based on Tongyici Cilin (cited by: 11)
Abstract: Extracting thematic words from a Chinese text can condense an article, distill the main ideas of a Chinese web page, and help match online advertisements to web pages precisely. This paper presents a thematic word extraction method based on Tongyici Cilin. Beyond the traditional factors that affect the weight of a thematic word, the method also considers how the occurrence of synonyms, related words, and hyponyms influences a word's weight. Experiments show that the method reaches an accuracy of 83.25% for thematic word extraction from Chinese text.
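Source: Journal of Guangxi Normal University: Natural Science Edition (indexed in CAS and the Peking University Core Journals list), 2007, No. 2, pp. 145-148 (4 pages)
Funding: National Natural Science Foundation of China (60272084); Key Project of the Science and Technology Development Program of the Beijing Municipal Commission of Education (KZ200310772013); Beijing Municipal Commission of Education projects (KM200510772008, KM200610772008)
Keywords: thematic words extraction; Tongyici Cilin; weight; synonymy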
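The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the record above gives no formulas, so the toy thesaurus tables (`SYNONYMS`, `RELATED`, `HYPONYMS`), the boost factors (`ALPHA`, `BETA`, `GAMMA`), and the function names are all hypothetical. Plain term frequency stands in for the "traditional factors", and the input is assumed to be already segmented into words.

```python
from collections import Counter

# Toy stand-ins for Tongyici Cilin lookups. Every entry is hypothetical;
# a real system would derive these relations from the thesaurus itself.
SYNONYMS = {"computer": {"pc"}, "pc": {"computer"}}
RELATED = {"computer": {"software"}}
HYPONYMS = {"computer": {"laptop"}}

# Hypothetical boost factors; the paper's actual coefficients are not
# given in this record.
ALPHA, BETA, GAMMA = 1.0, 0.5, 0.5


def thematic_weights(tokens):
    """Weight each word by its own frequency plus boosts from the
    frequencies of its synonyms, related words, and hyponyms."""
    tf = Counter(tokens)
    weights = {}
    for word, count in tf.items():
        boost = (
            ALPHA * sum(tf[s] for s in SYNONYMS.get(word, ()))
            + BETA * sum(tf[r] for r in RELATED.get(word, ()))
            + GAMMA * sum(tf[h] for h in HYPONYMS.get(word, ()))
        )
        weights[word] = count + boost
    return weights


def top_thematic_words(tokens, k=3):
    """Return the k highest-weighted words, ties broken alphabetically."""
    weights = thematic_weights(tokens)
    return sorted(weights, key=lambda w: (-weights[w], w))[:k]


if __name__ == "__main__":
    sample = ["computer", "pc", "laptop", "software", "music", "music"]
    print(thematic_weights(sample))
    print(top_thematic_words(sample, k=2))
```

In this sketch a word such as "computer" can outrank a more frequent word because occurrences of its synonym, related word, and hyponym all add to its weight, which is the effect the abstract attributes to the thesaurus-based factors.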
