Abstract
Imbalanced data arise widely in everyday applications such as disease diagnosis and mineral resource identification. Ensemble-learning methods are currently among the more mature techniques for classifying imbalanced data, but existing methods treat imbalanced data as a single homogeneous problem and do not distinguish between different types of imbalanced data. In practice, imbalanced datasets differ in imbalance ratio, dimensionality, and number of classes, and therefore in their underlying distributions, so a single uniform model is unlikely to perform well on every dataset. This paper therefore proposes an adaptive multiple classifier system based on the differential evolution algorithm (DE-AMCS), which selects the best ensemble learning model for each imbalanced dataset. Ten datasets from the KEEL repository are used to evaluate DE-AMCS, and the results are compared with five existing ensemble classification algorithms. The experiments show that DE-AMCS is competitive with or outperforms the compared algorithms on the evaluation metrics used, with a clear improvement in classification accuracy. Finally, DE-AMCS is applied to identifying oil-bearing reservoir layers in five wells in a block of the Jianghan Oil Field; for each well, the precision reaches 100%.
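The abstract does not describe how DE-AMCS encodes or scores candidate ensemble models, so the following is only a minimal sketch of the generic differential evolution search (the common DE/rand/1/bin variant) that such a system could build on. The function names, the parameter values, and the toy objective `toy_validation_error` are assumptions made for illustration and are not taken from the paper; in a real system the objective would train and evaluate an ensemble configuration on validation data.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimise `objective` over the box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialise the population uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, none of them the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j in range(dim):
                if j == j_rand or rng.random() < cr:
                    # Mutation: base vector plus a scaled difference vector.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip back into the box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_error(x):
    """Hypothetical stand-in for the validation error of an ensemble
    configuration; a real system would train and score classifiers here."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2


best_x, best_err = differential_evolution(
    toy_validation_error, [(0.0, 1.0), (0.0, 1.0)]
)
```

Here each individual is a real-valued vector; one possible design, again an assumption, is to let the vector components stand for ensemble settings such as a resampling rate or per-classifier weights, so that the search adapts the ensemble to each dataset.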
Authors
郭海湘
顾明赟
李诒靖
黄媛玥
王文杰
GUO Haixiang (1,2,3,4), GU Mingyun (1), LI Yijing (1), HUANG Yuanyue (1), WANG Wenjie (1) (1. School of Economics and Management Science, China University of Geosciences, Wuhan 430074, China; 2. Research Center for Digital Business Management, China University of Geosciences, Wuhan 430074, China; 3. Mineral Resource Strategy and Policy Research Center of China University of Geosciences, Wuhan 430074, China; 4. Key Laboratory for the Land and Resources Strategic Studies, Ministry of Land and Resources, Wuhan 430074, China)
Source
《系统工程理论与实践》
EI
CSSCI
CSCD
Peking University Core Journals
2018, No. 5, pp. 1284-1299 (16 pages)
Systems Engineering-Theory & Practice
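Funding
National Natural Science Foundation of China (71573237)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-13-1012)
Humanities and Social Sciences Research Planning Fund, Ministry of Education (15YJA630019)
Natural Science Foundation of Hubei Province (2016CFB503)
Project of the China Institute of Geo-Environment Monitoring (000121 2016CC600133)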
Keywords
imbalanced data
oil reservoir
differential evolution
adaptive learning
ensemble learning