Abstract
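To solve the Web document classification problem efficiently, a document classification algorithm based on kernel discriminant analysis (KDA) and a support vector machine (SVM) is proposed. The algorithm first uses KDA to reduce the dimensionality of the high-dimensional Web document space of the training set, and then performs classification and prediction in the reduced low-dimensional feature space with an SVM optimized by multiplicative update rules. Experiments were conducted on two well-known document classification datasets, Reuters-21578 and 20-Newsgroup. The results show that the algorithm not only achieves higher classification accuracy but also requires less running time.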
To efficiently solve the Web document classification problem, a novel document classification algorithm based on kernel discriminant analysis (KDA) and SVM was proposed. The proposed algorithm first reduced the high-dimensional Web document space of the training set to a lower-dimensional space with the KDA algorithm; classification and prediction in the lower-dimensional feature space were then implemented with the multiplicative update-based optimal SVM. The experimental evaluations were performed on the Reu...
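The abstract does not give the update rule itself, so the following is only a minimal sketch of the second stage, under stated assumptions. It uses one known multiplicative-update scheme for the SVM dual (the nonnegative quadratic programming updates of Sha, Saul and Lee), which may differ from the rule used in the paper. The KDA stage is not shown: the toy points are hypothetical and stand in for documents already projected into a low-dimensional space. The function names, the linear-plus-one kernel (which absorbs the bias term), and the iteration count are all illustrative choices.

```python
import math

def kernel(x, z):
    # Linear kernel plus 1; the constant absorbs the bias term (an assumption).
    return sum(a * b for a, b in zip(x, z)) + 1.0

def train_mu_svm(X, y, iters=500):
    """Minimize 0.5 * a^T A a - sum(a) subject to a >= 0 by multiplicative updates.

    A[i][j] = y[i] * y[j] * kernel(X[i], X[j]). Every coefficient is rescaled
    in parallel by a positive factor, so nonnegativity holds automatically.
    """
    n = len(X)
    A = [[y[i] * y[j] * kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [1.0] * n
    for _ in range(iters):
        updated = []
        for i in range(n):
            # Split row i of A into its positive and negative parts.
            pos = sum(A[i][j] * alpha[j] for j in range(n) if A[i][j] > 0)
            neg = sum(-A[i][j] * alpha[j] for j in range(n) if A[i][j] < 0)
            if pos == 0.0:
                updated.append(0.0)  # guard against division by zero
                continue
            # The linear coefficient is -1 for every i in the SVM dual.
            factor = (1.0 + math.sqrt(1.0 + 4.0 * pos * neg)) / (2.0 * pos)
            updated.append(alpha[i] * factor)
        alpha = updated
    return alpha

def predict(X, y, alpha, x):
    # Sign of the kernel expansion over the training points.
    score = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1

# Hypothetical 2-D features, standing in for KDA-reduced documents.
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (0.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]
y = [1, 1, 1, -1, -1, -1]
alpha = train_mu_svm(X, y)
```

Calling predict(X, y, alpha, point) then labels a new reduced-dimension point. The appeal of this kind of update is that it needs no learning rate and no explicit projection onto the constraint set, which is one plausible reason a multiplicative scheme can keep running time low.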
Source
《计算机应用》
CSCD
Peking University Core Journal
2009, No. 2, pp. 416-418 (3 pages)
Journal of Computer Applications
Funding
Key Project of Science and Technology Research, Ministry of Education of China (107021)
Keywords
document classification
Kernel Discriminant Analysis (KDA)
Support Vector Machine (SVM)
data mining