Abstract
Feature selection algorithms in machine learning simplify model inputs, improve interpretability, and help avoid the curse of dimensionality and overfitting. In wrapper-based feature selection, the evaluation model usually takes the feature subset produced by the search algorithm directly as input, so the exploitation and assessment of features is limited by the evaluation model's own feature-learning ability, which in turn limits the discovery of better-suited feature subsets. To address this, a wrapper method with subset feature pre-learning based on the cascade forest structure is proposed. The method inserts a multi-layer cascade forest between the search algorithm and the evaluation model, transforming each candidate feature subset into a high-level feature set; this lowers the pattern-recognition difficulty faced by the evaluation model and improves the assessment of subset performance. Experiments comparing various combinations of search algorithms and evaluation models on multiple datasets show that the proposed method further reduces the number of selected features while maintaining classification performance and preserving the low coupling of wrapper methods.
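The idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: a cascade-forest-style layer (in the spirit of gcForest) re-learns a candidate feature subset into class-probability features before a wrapper's evaluation model scores it. All function names, parameters, and model choices here are illustrative assumptions.

```python
# Hedged sketch: wrapper-style subset evaluation with a cascade-forest
# pre-learning step between the search algorithm and the evaluation model.
# The specific models, layer count, and dataset are illustrative only.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, cross_val_score


def cascade_transform(X, y, n_layers=2, random_state=0):
    """Re-learn X into high-level features via stacked forest layers."""
    features = X
    for layer in range(n_layers):
        probas = []
        for Forest in (RandomForestClassifier, ExtraTreesClassifier):
            clf = Forest(n_estimators=50, random_state=random_state + layer)
            # Out-of-fold class probabilities avoid leaking labels
            # into the learned features.
            probas.append(
                cross_val_predict(clf, features, y, cv=3,
                                  method="predict_proba"))
        # Each layer keeps the raw subset alongside the learned features,
        # mirroring the cascade forest's feature re-use.
        features = np.hstack([X] + probas)
    return features


def evaluate_subset(X, y, subset):
    """Score a feature subset after cascade pre-learning (wrapper step)."""
    X_sub = X[:, subset]
    X_high = cascade_transform(X_sub, y)
    evaluator = LogisticRegression(max_iter=1000)
    return cross_val_score(evaluator, X_high, y, cv=3).mean()


X, y = load_breast_cancer(return_X_y=True)
score = evaluate_subset(X, y, subset=[0, 7, 20, 27])
print(round(score, 3))
```

In a full wrapper method, a search algorithm (e.g. a genetic algorithm) would call `evaluate_subset` for each candidate subset; only the pre-learning step differs from a plain wrapper, so the search and evaluation components remain loosely coupled, consistent with the abstract's claim.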
Authors
PAN Limin;TONG Tong;LUO Senlin;QIN Xiaonan(Information System & Security and Countermeasures Experiments Center, Beijing Institute of Technology, Beijing 100081,China)
Source
《北京理工大学学报》
EI
CAS
CSCD
Peking University Core Journal (北大核心)
2021, No. 11, pp. 1201-1206 (6 pages)
Transactions of Beijing Institute of Technology
Funding
National Science and Technology Support Program of the 13th Five-Year Plan (SQ2018YFC200004); Special Scientific Research Fund for the Health Industry, National Ministry of Health (201302008).
Keywords
feature selection
wrapper method
cascade forest
feature learning