A Study of Text Term Weighting Based on Position (基于位置的文本特征加权方法研究)

Cited by: 9
Abstract: TF-IDF is a common method for weighting text features. It is simple to apply, but it ignores the effect of a term's position on its weight. By modifying the TF-IDF factors with position information and analyzing how text representations differ under different conditions, we propose three position-based methods for weighting text features. A subsequent text classification experiment shows that all three weighting models label text more precisely than conventional TF-IDF.
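Source: Microelectronics & Computer (《微电子学与计算机》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2009, No. 2, pp. 188-192 (5 pages)
Funding: National Natural Science Foundation of China (70571087)
Keywords: feature weighting; position weighting; modified TF-IDF; text classification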
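
The abstract does not spell out the three proposed weighting schemes, so the following is only a minimal sketch of the general idea of folding position into TF-IDF, not the authors' method. It assumes each document is split into named positions (here `title` and `body`) and that an occurrence in a more prominent position counts more toward term frequency. The `POSITION_WEIGHTS` values and all function names are hypothetical, chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical position weights (an assumption, not taken from the paper):
# a title occurrence counts three times as much as a body occurrence.
POSITION_WEIGHTS = {"title": 3.0, "body": 1.0}


def position_weighted_tf(doc):
    """Sum position weights over every occurrence of each term.

    `doc` maps a position name to the list of tokens found there.
    """
    tf = Counter()
    for position, tokens in doc.items():
        weight = POSITION_WEIGHTS.get(position, 1.0)
        for token in tokens:
            tf[token] += weight
    return tf


def inverse_document_frequency(docs):
    """Plain IDF: log(N / df), where df counts documents containing the term."""
    df = Counter()
    for doc in docs:
        df.update({t for tokens in doc.values() for t in tokens})
    n = len(docs)
    return {term: math.log(n / count) for term, count in df.items()}


def position_tfidf(docs):
    """Return one {term: weight} dict per document."""
    idf = inverse_document_frequency(docs)
    return [
        {term: tf * idf[term] for term, tf in position_weighted_tf(doc).items()}
        for doc in docs
    ]


docs = [
    {"title": ["text", "classification"],
     "body": ["position", "weights", "help", "classification"]},
    {"title": ["image", "retrieval"],
     "body": ["position", "of", "objects"]},
]
scores = position_tfidf(docs)
```

In this toy corpus, a term that appears in the title of the first document (`text`) receives a larger weight than an equally rare term that appears only in its body (`weights`), while a term shared by both documents (`position`) gets an IDF of zero, as in ordinary TF-IDF.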