Abstract
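The thyroid gland releases thyroid hormones to regulate the body's metabolic rate. Too much or too little thyroid hormone causes hyperthyroidism or hypothyroidism, respectively, both of which are thyroid diseases. In real medical data, thyroid disease data are typically imbalanced. Traditional classification methods often ignore the heterogeneity of imbalanced data (the imbalance ratio, feature dimension, and number of classes differ from one dataset to another). To address both the imbalanced class distribution and the heterogeneity of thyroid disease data, this paper proposes an adaptive multiple classifier system (AMCS), which builds a multiple-classifier ensemble that adaptively classifies heterogeneous imbalanced thyroid disease data to assist diagnosis. AMCS comprises four components: feature selection, ensemble framework, base classifiers, and ensemble rules. Each component has a candidate pool of algorithms, and the system adaptively selects the best ensemble algorithm for each dataset according to its heterogeneity. Experiments on 10 heterogeneous thyroid disease datasets from KEEL and UCI verify the effectiveness of the proposed method in assisting thyroid disease diagnosis.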
GUO Haixiang1,2,3,4, HUANG Yuanyue1, GU Mingyun1, PAN Wenwen1 (1. School of Economics and Management Science, China University of Geosciences, Wuhan 430074, China; 2. Research Center for Digital Business Management, China University of Geosciences, Wuhan 430074, China; 3. Mineral Resource Strategy and Policy Research Center of China University of Geosciences, Wuhan 430074, China; 4. Key Laboratory for the Land and Resources Strategic Studies, Ministry of Land and Resources, Wuhan 430074, China)
The thyroid gland produces thyroid hormones that help regulate the body's metabolism. Under-production and over-production of thyroid hormone cause hypothyroidism and hyperthyroidism, respectively, both of which are thyroid diseases. In real medical settings, thyroid disease data are typically imbalanced. Traditional machine learning methods ignore the structural differences among imbalanced datasets, which vary in imbalance ratio, dimensionality, and number of classes. To address both the imbalance and the structural differences in thyroid disease data, this paper proposes an adaptive multiple classifier system (AMCS) that adaptively selects the optimal learning model to assist in the diagnosis of thyroid diseases. AMCS consists of re-sampling, ensemble framework, feature selection, base classifiers, and ensemble rules. Popular algorithms serve as candidates for each component, and the optimal algorithm from each component is combined into a single routine for the classification task. Ten thyroid datasets with diverse structures from KEEL and UCI are used in the experiments. The experimental results show that the proposed approach is a competitive aid in thyroid disease diagnosis.
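The component-wise selection described in the abstract can be illustrated with a minimal sketch. It assumes an exhaustive search over one candidate per component, which the abstract does not confirm as the paper's actual selection strategy. The pool contents, `describe_dataset`, `select_configuration`, and the `evaluate` callback are all hypothetical names introduced here for illustration.

```python
from collections import Counter
from itertools import product


def describe_dataset(X, y):
    """Summarise the structural properties the abstract lists for a dataset."""
    counts = Counter(y)
    return {
        # Size of the largest class divided by the size of the smallest class.
        "imbalance_ratio": max(counts.values()) / min(counts.values()),
        "n_features": len(X[0]),
        "n_classes": len(counts),
    }


def select_configuration(pools, evaluate):
    """Score every combination of one candidate per component; keep the best.

    pools:    dict mapping component name -> list of candidate names.
    evaluate: callable taking a {component: candidate} dict and returning a
              validation score on the dataset at hand (higher is better).
    """
    names = list(pools)
    best_config, best_score = None, float("-inf")
    for combo in product(*(pools[name] for name in names)):
        config = dict(zip(names, combo))
        score = evaluate(config)
        if score > best_score:  # ties keep the earlier combination
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical candidate pools: the abstract names the components but does
# not list the algorithms inside each pool.
POOLS = {
    "resampling": ["none", "oversample", "undersample"],
    "feature_selection": ["none", "filter"],
    "ensemble_framework": ["bagging", "boosting"],
    "base_classifier": ["tree", "svm"],
    "ensemble_rule": ["majority_vote", "max"],
}
```

In practice `evaluate` would train the configured ensemble and return a cross-validated, imbalance-aware metric for the dataset. Because each dataset is scored separately, datasets with different structures can end up with different configurations.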
Authors
郭海湘
黄媛玥
顾明赟
潘雯雯
GUO Haixiang; HUANG Yuanyue; GU Mingyun; PAN Wenwen (School of Economics and Management Science, China University of Geosciences, Wuhan 430074, China; Research Center for Digital Business Management, China University of Geosciences, Wuhan 430074, China; Mineral Resource Strategy and Policy Research Center of China University of Geosciences, Wuhan 430074, China; Key Laboratory for the Land and Resources Strategic Studies, Ministry of Land and Resources, Wuhan 430074, China)
Source
《系统工程理论与实践》
EI
CSSCI
CSCD
PKU Core Journals
2018, Issue 8, pp. 2123-2134 (12 pages)
Systems Engineering-Theory & Practice
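Funding
National Natural Science Foundation of China (71573237)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-13-1012)
Humanities and Social Sciences Research Planning Fund of the Ministry of Education (15YJA630019)
Natural Science Foundation of Hubei Province (2016CFB503)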
Keywords
imbalanced classification
multiple classifier system
adaptive learning
thyroid disease diagnosis