Abstract
In the era of big data, classification data from industries such as electric power and healthcare, which have massive high-dimensional features, are often imbalanced, and misclassifying minority-class samples often carries a high cost. The distribution of sample points differs from dataset to dataset and also affects classifier accuracy. The traditional KSVM classifier adds effective classification information for error-prone points near the separating hyperplane, but in doing so it also introduces more noise. To address the fixed threshold of the KSVM algorithm when it is applied to imbalanced data, this paper proposes an ε-KSVM classifier that adjusts the threshold dynamically for each dataset, reducing the misclassification information that is introduced. Experiments show that prediction accuracy improves considerably.
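The abstract does not give the decision rule, so the following is only a minimal sketch under stated assumptions. It assumes KSVM is the usual SVM plus k-nearest-neighbour hybrid: a sample whose decision value g(x) lies at least ε from the hyperplane takes the SVM label, and a sample inside the band is decided by a vote among the nearest support vectors. The linear model (`w`, `b`), the helper names, and the rule of picking ε by validation accuracy are all hypothetical illustrations, not the paper's method.

```python
import math

def decision_value(w, b, x):
    """Linear SVM decision value g(x) = w.x + b (w and b assumed already trained)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def knn_vote(x, support_vectors, labels, k=3):
    """Majority vote (labels are +1/-1) among the k support vectors nearest to x."""
    ranked = sorted(zip(support_vectors, labels),
                    key=lambda pair: math.dist(x, pair[0]))
    votes = sum(label for _, label in ranked[:k])
    return 1 if votes >= 0 else -1

def ksvm_predict(x, w, b, support_vectors, labels, eps, k=3):
    """Use the SVM label when |g(x)| >= eps, otherwise fall back to the k-NN vote."""
    g = decision_value(w, b, x)
    if abs(g) >= eps:
        return 1 if g > 0 else -1
    return knn_vote(x, support_vectors, labels, k)

def choose_eps(candidates, val_x, val_y, w, b, support_vectors, labels, k=3):
    """Hypothetical per-dataset rule: keep the candidate eps with the best
    validation accuracy (the paper's actual adjustment rule is not stated)."""
    def accuracy(eps):
        hits = sum(ksvm_predict(x, w, b, support_vectors, labels, eps, k) == y
                   for x, y in zip(val_x, val_y))
        return hits / len(val_y)
    return max(candidates, key=accuracy)

# Toy data, invented for illustration only.
w, b = [1.0, 0.0], 0.0
support_vectors = [(0.2, 1.0), (0.0, 1.2), (1.0, 0.0), (-1.0, 0.0)]
labels = [-1, -1, 1, -1]
val_x, val_y = [(0.1, 1.0), (2.0, 0.0)], [-1, 1]
best_eps = choose_eps([0.0, 0.5], val_x, val_y, w, b, support_vectors, labels)
```

On imbalanced data, plain accuracy can hide minority-class errors, so a real selection rule would more likely score candidates with a class-sensitive metric; accuracy is used here only to keep the sketch short.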
Source
《计算机应用与软件》
Peking University Core Journal
2018, No. 1, pp. 276-280, 303 (6 pages)
Computer Applications and Software
Funding
Project of the Science and Technology Department of State Grid (SGTYHT/14-JS-188)