Abstract
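Combining the attribute reduction of rough sets with the classification mechanism of neural networks, a hybrid algorithm is proposed. Rough-set attribute reduction is first applied as a preprocessor to delete redundant attributes from the decision table, and a neural network is then used for classification. This greatly reduces the vector dimensionality and overcomes the sensitivity of rough sets to noise in the decision table. Experimental results show that, compared with the traditional classification methods naive Bayes, SVM, and KNN, the method markedly improves classification speed while maintaining classification accuracy, and shows good stability and fault tolerance. It is especially suited to texts that have many feature vectors and are hard to classify.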
A hybrid classifier, Rough-ANN, is presented that combines rough set theory with a BP neural network. First, the documents are represented using the vector space model. Second, the feature vectors are reduced using rough sets. Finally, the documents are classified by the BP neural network. Experimental results show that Rough-ANN is effective for text classification and performs better in classification precision, stability, and fault tolerance than the traditional classification methods (Bayesian classifiers, SVM, and kNN), especially for complex classification problems with many feature vectors.
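The preprocessing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which reduction algorithm the paper uses, so the code below assumes a greedy, QuickReduct-style search driven by the rough-set dependency degree. The function names (`dependency`, `quick_reduct`, `project`) and the toy decision table are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough-set dependency degree of the labels on `attrs`: the fraction of
    rows whose equivalence class under `attrs` carries a single label."""
    seen_labels = defaultdict(set)
    sizes = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        sizes[key] += 1
    positive = sum(n for key, n in sizes.items() if len(seen_labels[key]) == 1)
    return positive / len(rows)

def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until it matches the dependency degree of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(candidates, key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return sorted(reduct)

def project(rows, attrs):
    """Keep only the selected columns; these would feed the classifier."""
    return [tuple(row[a] for a in attrs) for row in rows]

# Hypothetical decision table: binary term-presence features per document.
# Column 1 duplicates column 0 and column 3 is constant, so both are redundant.
ROWS = [
    (1, 1, 0, 1),
    (1, 1, 0, 1),
    (0, 0, 1, 1),
    (1, 1, 1, 1),
    (0, 0, 0, 1),
]
LABELS = ["sports", "sports", "finance", "finance", "other"]

REDUCT = quick_reduct(ROWS, LABELS)
REDUCED_ROWS = project(ROWS, REDUCT)
```

On this toy table the search keeps columns 0 and 2 and drops the duplicate and constant columns. In a pipeline like the one the abstract describes, the reduced rows would then be the input vectors of the neural network; the greedy search does not guarantee a minimal reduct in general.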
Source
《情报学报》
CSSCI
PKU Core (Peking University Core Journals)
2006, Issue 4, pp. 475-480 (6 pages)
Journal of the China Society for Scientific and Technical Information
Keywords
text classification
rough sets
neural networks
attribute reduction
VSM