Abstract
To address the shortcomings of information gain as a feature reduction method in text categorization, this paper proposes a feature reduction method based on relative document frequency balanced information gain (RDFBIG). Experimental results show that RDFBIG effectively eliminates the effect of differing corpus sizes across classes on classification accuracy and achieves good classification results.
Source
Journal of Jiangxi University of Science and Technology (《江西理工大学学报》)
CAS
2008, Issue 5, pp. 68-71 (4 pages)
Keywords
relative document frequency
feature reduction
information gain
text categorization