Abstract
To address the problem of Web information filtering, this paper proposes a new data classification and filtering method that combines rough set theory with a decision tree SVM (DT_SVM). The method applies an improved heuristic relative attribute reduction algorithm to eliminate redundancy and reduce the dimensionality of the sample space. It then trains the SVM by combining clustering with DT_SVM, which converts the multiclass problem into a series of binary classification problems and improves both training speed and filtering precision. Experimental results show that the algorithm achieves high recall and precision, demonstrating the advantage of combining rough set theory with DT_SVM.
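The abstract does not spell out the improved heuristic reduction algorithm, so the following is only a minimal sketch of a generic greedy relative attribute reduction based on the rough set dependency degree, not the authors' method. The function names (`partition`, `dependency`, `greedy_reduct`) and the toy term-presence data are assumptions made for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Dependency degree: fraction of rows lying in the positive region,
    i.e. in equivalence classes whose rows all share one decision label."""
    consistent = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            consistent += len(block)
    return consistent / len(rows)


def greedy_reduct(rows, labels):
    """Greedy forward selection followed by a backward redundancy check.
    Returns a subset of attribute indices that preserves the dependency
    degree of the full attribute set (a heuristic reduct, not guaranteed
    to be the smallest one)."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    # Forward pass: add the attribute that raises the dependency degree most.
    while dependency(rows, labels, reduct) < target:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(candidates,
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    # Backward pass: drop any attribute that is no longer needed.
    for a in list(reduct):
        rest = [b for b in reduct if b != a]
        if dependency(rows, labels, rest) == target:
            reduct = rest
    return sorted(reduct)


# Hypothetical toy data: each row marks presence (1) or absence (0) of four
# terms in a document; labels are the document categories.
rows = [
    (1, 0, 1, 0),
    (1, 0, 0, 0),
    (0, 1, 1, 0),
    (0, 1, 0, 0),
    (1, 1, 1, 0),
    (0, 0, 0, 0),
]
labels = ["sports", "sports", "finance", "finance", "finance", "other"]

print(greedy_reduct(rows, labels))
```

In this toy table the first two attributes are enough to separate the categories, so the other two are discarded before any classifier is trained. In a pipeline like the one the abstract describes, the retained attributes would then be passed to the clustering and DT_SVM training stage.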
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2008, Issue 15, pp. 208-210 (3 pages)
Computer Engineering
Funding
Heilongjiang Province Graduate Innovation Research Fund project (YJSCX2006-38HLJ)