Two-stage Deep Feature Selection Extraction Algorithm for Cancer Classification

Abstract: Cancer is one of the deadliest diseases in the world. Applying machine learning to gene microarray data can support the early diagnosis of cancer, but in microarray datasets the number of gene features far exceeds the number of samples. This leaves the data imbalanced and degrades both the efficiency and the accuracy of classification, which makes feature selection on microarray data especially important. Most existing feature selection algorithms select features under a single criterion, rarely consider feature extraction, and mostly rely on long-established neural networks, so their classification accuracy is low. This paper therefore proposes a two-stage deep feature selection (TSDFS) algorithm. The first stage integrates three feature selection algorithms to select features comprehensively and obtain a feature subset. The second stage uses an unsupervised neural network to obtain the best representation of that feature subset, which improves the final classification accuracy. The effectiveness of TSDFS is analyzed by comparing classification performance before and after feature selection and by comparing TSDFS with other feature selection algorithms. The experimental results show that TSDFS reduces the number of features while maintaining or improving classification accuracy.
Authors: HU Yan-yu (胡艳羽), ZHAO Long (赵龙), DONG Xiang-jun (董祥军) (College of Computer Science and Technology, Qilu University of Technology, Jinan 250353, China)
Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2022, No. 7, pp. 73-78 (6 pages)
Funding: National Natural Science Foundation of China (62076143, 61806105); Natural Science Foundation of Shandong Province (ZR2017LF020).
Keywords: Microarray data; Feature selection; Deep learning; Random forest; Variational auto-encoder
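
The abstract says the first stage combines three feature selection algorithms into one feature subset, but it does not name the three algorithms or the rule used to combine them. The sketch below illustrates the general idea only, under stated assumptions: the three scorers (variance, absolute Pearson correlation with the label, and a Fisher-style score), the majority-vote rule, the function names, and the synthetic data are all hypothetical choices made for illustration and are not taken from the paper. The second stage (representation learning with an unsupervised neural network) is not shown.

```python
import math
import random

def variance_scores(X, y):
    """Score each feature by its variance (y is unused; kept for a uniform signature)."""
    n = len(X)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        scores.append(sum((v - mean) ** 2 for v in col) / n)
    return scores

def correlation_scores(X, y):
    """Score each feature by |Pearson correlation| with the class label."""
    n = len(X)
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(d * d for d in y_dev))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        cov = sum(a * b for a, b in zip(dev, y_dev))
        scores.append(abs(cov) / (norm * y_norm) if norm and y_norm else 0.0)
    return scores

def fisher_scores(X, y):
    """Score each feature by (class mean gap)^2 / (sum of class variances); labels must be 0/1."""
    scores = []
    for j in range(len(X[0])):
        groups = {0: [], 1: []}
        for row, label in zip(X, y):
            groups[label].append(row[j])
        means = {c: sum(v) / len(v) for c, v in groups.items()}
        var = {c: sum((x - means[c]) ** 2 for x in v) / len(v) for c, v in groups.items()}
        denom = var[0] + var[1]
        scores.append((means[0] - means[1]) ** 2 / denom if denom else 0.0)
    return scores

def top_k(scores, k):
    """Return the indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])

def ensemble_select(X, y, k, min_votes=2):
    """Keep features that at least min_votes of the three scorers rank in their top k."""
    votes = {}
    for scorer in (variance_scores, correlation_scores, fisher_scores):
        for j in top_k(scorer(X, y), k):
            votes[j] = votes.get(j, 0) + 1
    return sorted(j for j, v in votes.items() if v >= min_votes)

# Synthetic stand-in for microarray data: 20 samples, 50 features,
# of which only features 0 and 1 depend on the class label.
random.seed(0)
y = [i % 2 for i in range(20)]
X = [[4 * label + random.gauss(0, 0.5) if j < 2 else random.gauss(0, 1)
      for j in range(50)] for label in y]

selected = ensemble_select(X, y, k=5)
print(selected)
```

Voting over the top-k sets, rather than averaging raw scores, avoids having to put scores with different units on a common scale. Note that for balanced binary labels the correlation and Fisher-style rankings tend to agree, so a real ensemble would likely pair more diverse selectors.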