Auto-Classification of Web Page Based on the Improved TF-IDF Weighting Algorithm (cited by: 2)
Abstract: TF-IDF is a commonly used method for representing document feature weights, but it does not truly reflect how much a feature term contributes to distinguishing each class. To address the problems with feature selection methods in web page classification, this paper adds web page tag feature weights to the TF-IDF formula and proposes a relatively effective web page classification algorithm. Experimental results show that the method selects features well and can effectively improve classification accuracy.
Source: Journal of Mianyang Teachers' College (《绵阳师范学院学报》), 2010, No. 8, pp. 106-109 (4 pages)
Keywords: web page classification; TF-IDF; feature weighting
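
This record does not reproduce the paper's improved formula, so the following is only a minimal sketch of the general idea of letting HTML tags influence TF-IDF weights. It assumes that a term's frequency is scaled by a weight for the tag it appears in; the `TAG_WEIGHTS` values, the function names, and the `(tag, term)` input format are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical tag weights, chosen only for illustration (not from the paper).
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}


def tag_weighted_tf(doc):
    """Sum tag weights per term; doc is a list of (tag, term) pairs."""
    tf = Counter()
    for tag, term in doc:
        # Terms in unknown tags fall back to a neutral weight of 1.0.
        tf[term] += TAG_WEIGHTS.get(tag, 1.0)
    return tf


def tag_weighted_tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    tfs = [tag_weighted_tf(doc) for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())
    return [
        {term: weight * math.log(n_docs / df[term]) for term, weight in tf.items()}
        for tf in tfs
    ]


if __name__ == "__main__":
    pages = [
        [("title", "football"), ("body", "match")],
        [("body", "stock"), ("body", "market")],
        [("body", "football"), ("body", "stock")],
    ]
    for weights in tag_weighted_tfidf(pages):
        print(weights)
```

Under these assumptions, a term that appears in a page title receives a larger weight than the same term appearing only in body text, which is one plausible way tag information could sharpen class-discriminating features before a classifier is trained.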