Abstract
TF-IDF is a commonly used method for representing document feature weights, but it does not truly reflect how much a feature term contributes to distinguishing each class. To address this shortcoming of feature selection in web page classification, the paper adds web page tag feature weights to improve the TF-IDF formula and proposes a comparatively effective web page classification algorithm. Experimental results show that the method selects features well and can effectively improve classification accuracy.
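The abstract does not give the improved formula, so the following is only a minimal sketch of one way tag weights could enter a TF-IDF computation: each occurrence of a term is counted with the weight of the HTML tag it appears in before the usual IDF factor is applied. The tag weights in `TAG_WEIGHTS`, the function names, and the toy documents are all illustrative assumptions, not values or definitions taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical tag weights (assumption): terms in a title count more
# than terms in a heading, which count more than terms in body text.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}


def weighted_tf(doc):
    """Return tag-weighted term frequencies, normalized to sum to 1.

    `doc` is a list of (term, tag) pairs; unknown tags get weight 1.0.
    """
    counts = defaultdict(float)
    for term, tag in doc:
        counts[term] += TAG_WEIGHTS.get(tag, 1.0)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: c / total for term, c in counts.items()}


def tag_weighted_tfidf(docs):
    """Compute tag-weighted TF times the standard IDF, log(N / df)."""
    n_docs = len(docs)
    doc_freq = defaultdict(int)
    for doc in docs:
        # Count each term at most once per document.
        for term in {t for t, _ in doc}:
            doc_freq[term] += 1
    scores = []
    for doc in docs:
        tf = weighted_tf(doc)
        scores.append(
            {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
        )
    return scores


# Toy pages (assumption): each token is tagged with where it occurred.
docs = [
    [("python", "title"), ("python", "body"), ("code", "body")],
    [("football", "title"), ("code", "body")],
]
scores = tag_weighted_tfidf(docs)
```

In this sketch a term that appears in every page, such as "code" here, still receives a score of zero from the IDF factor, while a term found in a title is boosted relative to one found only in body text. How the paper actually combines the two factors may differ.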
Source
《绵阳师范学院学报》
2010, No. 8, pp. 106-109 (4 pages)
Journal of Mianyang Teachers' College