Abstract
To address the high dimensionality, small sample size, and high noise of gene expression profiles, a selective ensemble classification method is proposed. First, an improved information index to classification is used for attribute reduction, removing a large number of irrelevant genes and reducing the dimensionality of the feature space. Then, a double perturbation, combining bootstrap-based sample perturbation with feature perturbation based on kernelized fuzzy rough sets, is used to construct multiple sample subsets, on which multiple base classifiers are trained. Finally, the teaching-learning-based optimization algorithm is applied to construct the selective ensemble classifier. Simulation results show that the proposed algorithm has strong advantages in classification accuracy, ensemble size, and stability.
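The pipeline in the abstract (perturb samples and features, train a pool of base classifiers, then keep only a subset of them) can be illustrated with a minimal sketch. It is not the paper's implementation, and several parts are assumptions made only for illustration: random feature subsets stand in for the kernelized fuzzy rough set reduction, a simple ordered pruning on a validation set stands in for the teaching-learning-based optimization search, the nearest-centroid base classifier is a placeholder because the abstract does not name one, and the synthetic data and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_centroids(X, y, feats):
    """Nearest-centroid base classifier restricted to the feature indices in feats."""
    sums, counts = {}, Counter()
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, f in enumerate(feats):
            acc[k] += row[f]
        counts[label] += 1
    return feats, {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict(model, row):
    feats, centroids = model
    def sq_dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, centroids[c]))
    return min(sorted(centroids), key=sq_dist)

def vote(models, row):
    """Majority vote of the given base classifiers."""
    return Counter(predict(m, row) for m in models).most_common(1)[0][0]

def accuracy(models, X, y):
    return sum(vote(models, r) == t for r, t in zip(X, y)) / len(y)

def build_pool(X, y, n_models, n_feats, rng):
    """Double perturbation: bootstrap the samples, then draw a feature subset."""
    n, d = len(X), len(X[0])
    pool = []
    for _ in range(n_models):
        rows = [rng.randrange(n) for _ in range(n)]   # bootstrap resample
        feats = rng.sample(range(d), n_feats)         # assumption: random subset, not KFRS
        pool.append(fit_centroids([X[i] for i in rows], [y[i] for i in rows], feats))
    return pool

def select_ensemble(pool, X_val, y_val):
    """Assumption: ordered pruning instead of TLBO. Rank base classifiers by
    validation accuracy and keep the shortest prefix with the best vote accuracy."""
    ranked = sorted(pool, key=lambda m: accuracy([m], X_val, y_val), reverse=True)
    scores = [accuracy(ranked[:k], X_val, y_val) for k in range(1, len(ranked) + 1)]
    best_k = scores.index(max(scores)) + 1
    return ranked[:best_k]

def make_data(n, rng):
    """Hypothetical two-class data: 5 informative features, then 15 noise features."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(1.5 * label, 1.0) for _ in range(5)]
        row += [rng.gauss(0.0, 1.0) for _ in range(15)]
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(40, rng)
X_val, y_val = make_data(40, rng)
pool = build_pool(X_train, y_train, n_models=15, n_feats=6, rng=rng)
selected = select_ensemble(pool, X_val, y_val)
print(len(pool), len(selected), accuracy(selected, X_val, y_val))
```

The selection step is where a population-based optimizer such as teaching-learning-based optimization would plug in: it would search over subsets of `pool` with validation accuracy (and possibly ensemble size) as the fitness, whereas the ordered pruning here only considers prefixes of a ranked list.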
Author
陈涛
CHEN Tao (School of Mathematics and Computer Science, Shaanxi University of Technology, Hanzhong 723000, China)
Source
Science Technology and Engineering (《科学技术与工程》)
Peking University Core Journal (北大核心)
2018, Issue 21, pp. 232-238 (7 pages)
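Funding
National Natural Science Foundation of China (11502132)
Scientific Research Fund of the Shaanxi Provincial Department of Education (14JK1148)
Scientific Research Fund of Shaanxi University of Technology (SLGQD2017-07)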
Keywords
gene expression profile
selective ensemble
multiclass information index to classification
kernelized fuzzy rough sets
teaching-learning-based optimization