Abstract
To address the shortcomings of existing weight computation methods in Chinese Web text classification and the inefficiency of the support vector machine (SVM) algorithm on large-scale pattern classification, this paper proposes an SVM classification method based on rough set reduction and weighting. The rough set serves as a preprocessor for SVM classification: rough set reduction theory is applied to simplify the data before classification, and a variable precision rough set weighting method designed for Chinese Web text is applied to compute the weights, thereby improving the efficiency and precision of the subsequent SVM classification. Experimental results show that classification performance improves further when the SVM classifies the reduced and weighted data.
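The preprocessing idea summarized above can be illustrated with a minimal sketch. It is not the paper's method: it assumes the classical rough set dependency degree rather than the variable precision weighting the paper proposes, it uses a made-up toy term-presence table, and all function names (`partition`, `dependency`, `greedy_reduct`, `significance_weights`) are hypothetical. The SVM step is omitted; the reduced, weighted features would be passed to whichever SVM implementation is in use.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Classical dependency degree: fraction of rows lying in equivalence
    classes whose members all share one label (the positive region)."""
    if not rows:
        return 0.0
    pos = sum(len(block) for block in partition(rows, attrs)
              if len({labels[i] for i in block}) == 1)
    return pos / len(rows)

def greedy_reduct(rows, labels):
    """Greedily add attributes until the dependency of the full set is reached."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct

def significance_weights(rows, labels, reduct):
    """Weight of attribute a = drop in dependency when a is removed from the reduct."""
    full = dependency(rows, labels, reduct)
    return {a: full - dependency(rows, labels, [b for b in reduct if b != a])
            for a in reduct}

# Toy data (assumed for illustration): each row marks presence of three terms.
rows = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0)]
labels = ["sports", "sports", "finance", "finance"]

reduct = greedy_reduct(rows, labels)
weights = significance_weights(rows, labels, reduct)
print(reduct, weights)
```

In this toy table only the first term separates the two classes, so the other two columns are dropped before any classifier sees the data. This shows how reduction can shrink the SVM's input.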
Source
《微型机与应用》
2014, Issue 20, pp. 55-57, 61 (4 pages)
Microcomputer & Its Applications
Keywords
SVM
rough set
reduction
weighting