Abstract
Imbalanced datasets arise widely in finance, business, scientific research, and many other fields, so their study has both theoretical and practical significance. This paper studies the processing and classification of imbalanced datasets. First, the SMOTE (synthetic minority over-sampling technique) algorithm is used to balance the imbalanced dataset. Next, classification models are built with the classification algorithms provided in the Weka software. Finally, these models are analyzed and compared with classification models built without any preprocessing. The comparison verifies that SMOTE is necessary for classifying imbalanced datasets, and also indicates that the approach needs further improvement.
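The balancing step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbours. This is a simplified illustration rather than the paper's or Weka's implementation. The function name `smote`, its parameters, and the toy data are assumptions made for this example, and the base sample is drawn at random instead of generating a fixed number of synthetic points per minority sample as in the original algorithm.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest to base first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy minority class with two features per sample.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=3)
```

Because each synthetic point lies between two existing minority samples, the new points stay inside the region the minority class already occupies. The balanced set (original plus synthetic samples) would then be passed to a classifier and compared against a model trained on the unbalanced data, as the abstract describes.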
Source
Journal of Electric Power (《电力学报》)
2010, Issue 4, pp. 349-352 (4 pages)