Abstract
Traditional classification algorithms assume by default that the data are balanced; applied to imbalanced data, they produce classification results that are unrepresentative or even invalid. To address this problem, a C-Vine copula model is first used to describe the dependence structure among the data. Virtual samples are then generated for the minority class of the imbalanced dataset so that the number of samples in each class is balanced. Finally, the balanced data are classified. Experiments on imbalanced datasets show that the proposed method achieves good classification performance and can effectively improve the classification performance of traditional classification algorithms.
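The oversampling step described in the abstract (model the dependence among features, then draw virtual minority samples until the classes are balanced) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it substitutes a much simpler Gaussian copula for the paper's C-Vine copula, uses only the standard library, and assumes at least two minority samples. The function names (`oversample_minority` and the underscore-prefixed helpers) and the toy data are hypothetical, and the final classification step is left out.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def _pseudo_obs(column):
    """Map a column to rank-based pseudo-observations in (0, 1)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def _correlation(zs):
    """Correlation matrix of the columns in zs (a list of equal-length lists)."""
    d, n = len(zs), len(zs[0])
    means = [sum(z) / n for z in zs]
    cov = [[sum((zs[a][k] - means[a]) * (zs[b][k] - means[b]) for k in range(n)) / n
            for b in range(d)] for a in range(d)]
    return [[cov[a][b] / (cov[a][a] * cov[b][b]) ** 0.5 for b in range(d)]
            for a in range(d)]


def _cholesky(m):
    """Lower-triangular Cholesky factor; the diagonal is floored for stability."""
    d = len(m)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = max(m[i][i] - s, 1e-12) ** 0.5
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def _quantile(sorted_vals, u):
    """Linearly interpolated empirical quantile at level u in [0, 1]."""
    pos = u * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def oversample_minority(minority, n_new, seed=0):
    """Draw n_new virtual rows from a Gaussian copula fitted to the minority rows.

    Assumption: a Gaussian copula stands in for the paper's C-Vine copula.
    """
    rng = random.Random(seed)
    cols = [list(c) for c in zip(*minority)]
    # Step 1: describe the dependence structure on the normal-score scale.
    zs = [[_STD_NORMAL.inv_cdf(u) for u in _pseudo_obs(c)] for c in cols]
    low = _cholesky(_correlation(zs))
    sorted_cols = [sorted(c) for c in cols]
    d = len(cols)
    synthetic = []
    for _ in range(n_new):
        # Step 2: sample correlated normals, then map back through the
        # empirical marginal distribution of each feature.
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        z = [sum(low[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]
        synthetic.append([_quantile(sorted_cols[i], _STD_NORMAL.cdf(z[i]))
                          for i in range(d)])
    return synthetic


# Toy data (hypothetical): 5 minority rows to be balanced against 20 majority rows.
minority = [[1.0, 2.1], [1.5, 4.2], [2.0, 2.9], [2.5, 4.8], [3.0, 6.1]]
majority_size = 20
new_rows = oversample_minority(minority, majority_size - len(minority))
balanced_minority = minority + new_rows
```

After this step `balanced_minority` has as many rows as the majority class, and any conventional classifier can be trained on the combined data. Because the marginals are empirical quantiles, each virtual feature value stays within the observed range of the minority class; a vine copula would additionally capture dependence that a single correlation matrix cannot.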
Authors
关红钧
王蕾
GUAN Hongjun; WANG Lei (Normal College, Shenyang University, Shenyang 110044, China; College of Mathematics and Systems Science, Shenyang Normal University, Shenyang 110034, China)
Source
《沈阳大学学报(自然科学版)》
CAS
2021, No. 4, pp. 364-368 (5 pages)
Journal of Shenyang University:Natural Science