Multi-Granularity Ensemble Classification Algorithm Based on Attribute Representation

Cited by: 4
Abstract: Faced with complex and changeable information systems, traditional multi-class models in machine learning cannot carry out a dynamic classification process, and so cannot handle problems such as disease diagnosis. Because some diagnostic procedures are expensive, it is necessary to judge from preliminary examinations whether a patient is likely to be ill, thereby reducing the cost of the process. Sequential three-way decisions, a multi-granularity classification algorithm, are commonly used to solve dynamic classification problems in a multi-granularity space. The sequential three-way decision model sorts attributes by balancing the cost of decision results against the cost of the decision process, and then constructs a multi-level granularity space. As information is injected level by level, objects that meet the conditions are classified at different granularity levels. In this sense the model addresses the problem of excessive decision-process cost, and many researchers have optimized it from a cost-sensitive perspective. However, in some cases the sequential three-way decision model is prone to decision conflicts in the coarse-grained space, that is, the same object receives several different classification results. As a consequence, many attributes must be considered in the fine-grained space, which leads to low classification efficiency. Moreover, lacking further information and a corresponding strategy, the model cannot process the objects that remain unclassified at the end.

This paper therefore combines the ideas of ensemble learning and granular computing to propose a multi-granularity ensemble classification algorithm based on attribute representation. First, attributes with strong classification ability are selected in each granularity layer as attribute representatives, and a classifier is built on each of them, forming an ensemble classifier based on attribute representatives. Synthesizing the different opinions of these classifiers effectively reduces decision conflicts in each granularity layer. Second, a scoring table retains the classification opinions of the classifiers in the coarse-grained space, which reduces the number of attributes that need to be considered in the fine-grained space. The retained scores help the ensemble classifier in the fine-grained space avoid likely errors and thus reach a more confident classification result. Finally, because some objects may still be unclassified after all the information has been injected, a "relatively optimal" strategy is adopted: the decision class with the lowest objection rate is taken as the classification result for those objects.

To verify the validity of the model, 14 UCI data sets and 6 real data sets related to medical diagnosis are used in horizontal and vertical comparison experiments, respectively. The horizontal comparison includes ten popular multi-class classification algorithms. The experiments show that the proposed method has better robustness, classification efficiency, and classification performance than sequential three-way decisions and the other machine learning multi-class algorithms, and that it improves significantly on the real medical diagnosis data sets.
Authors: ZHANG Qing-Hua (张清华); ZHI Xue-Chao (支学超); WANG Guo-Yin (王国胤); YANG Fan (杨帆); XUE Fu-Zhong (薛付忠)
Affiliations: Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing 400065; Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065; School of Public Health, Shandong University, Shandong 25000
Source: Chinese Journal of Computers (《计算机学报》), 2022, No. 8, pp. 1712-1729 (18 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
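Funding: Supported by the National Key Research and Development Program of China (2020YFC2003502), the National Natural Science Foundation of China (61876201), and the Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002).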
Keywords: dynamic classification; sequential three-way decisions; ensemble learning; attribute representation; multi-granularity
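The three steps named in the abstract (per-layer classifiers built on attribute representatives, a scoring table carried from coarse to fine layers, and a "relatively optimal" fallback) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no formulas, so the counting-based classifier, the thresholds `alpha` and `beta`, the `margin` acceptance rule, the hand-picked layers, and the small symptom data set are all assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def fit_attribute_classifier(rows, labels, attr):
    """Estimate P(class | attribute value) for one attribute by counting."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts[row[attr]][label] += 1
    return {
        value: {c: n / sum(by_class.values()) for c, n in by_class.items()}
        for value, by_class in counts.items()
    }


def classify(obj, layers, models, classes, alpha=0.6, beta=0.35, margin=2):
    """Return (decision class, layer index), or (class, None) for the fallback.

    Assumed rules (not taken from the paper): a classifier supports a class
    when P >= alpha, objects to it when P <= beta, and abstains otherwise.
    """
    support, objection = Counter(), Counter()  # the scoring table
    votes = 0
    for level, attrs in enumerate(layers):  # coarse layer first
        for attr in attrs:
            probs = models[attr].get(obj[attr])
            if probs is None:  # unseen attribute value: no opinion
                continue
            votes += 1
            for c in classes:
                p = probs.get(c, 0.0)
                if p >= alpha:
                    support[c] += 1
                elif p <= beta:
                    objection[c] += 1
        # Scores from coarser layers are kept, so a finer layer only has to
        # add enough evidence to separate the classes.
        confident = [c for c in classes if support[c] - objection[c] >= margin]
        if len(confident) == 1:
            return confident[0], level
    # "Relatively optimal" fallback: lowest objection rate, ties by support.
    best = min(classes, key=lambda c: (objection[c] / max(votes, 1), -support[c]))
    return best, None


# Hypothetical training data: each row is one object, LABELS its class.
ROWS = [
    {"fever": "high", "cough": "yes", "test": "pos"},
    {"fever": "high", "cough": "yes", "test": "pos"},
    {"fever": "high", "cough": "yes", "test": "neg"},
    {"fever": "low", "cough": "yes", "test": "neg"},
    {"fever": "low", "cough": "no", "test": "neg"},
    {"fever": "low", "cough": "no", "test": "neg"},
]
LABELS = ["flu", "flu", "cold", "cold", "healthy", "healthy"]
CLASSES = ["flu", "cold", "healthy"]
# One attribute representative per layer, chosen by hand for this sketch.
LAYERS = [["fever"], ["cough"], ["test"]]
MODELS = {a: fit_attribute_classifier(ROWS, LABELS, a) for layer in LAYERS for a in layer}

early = classify({"fever": "low", "cough": "no", "test": "neg"}, LAYERS, MODELS, CLASSES)
late = classify({"fever": "high", "cough": "yes", "test": "pos"}, LAYERS, MODELS, CLASSES)
fallback = classify({"fever": "high", "cough": "no", "test": "neg"}, LAYERS, MODELS, CLASSES)
print(early, late, fallback)
```

In this toy run the first object is accepted at the second layer, before the costly `test` attribute is consulted. The second needs all three layers. The third is never accepted and receives the class with the lowest objection rate.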