Abstract
Building on redundancy and relevance analysis, this work combines feature selection with manifold learning theory to propose a system model, and applies it to disease classification on five gene expression data sets (NCI, Lymphoma, Lung, Leukemia, Colon). The experimental results indicate that the modeling system reduces the computational cost of data processing, helps determine the number of feature genes, and in turn improves disease classification accuracy. It shows good application prospects in diagnosis and in the design of individualized treatment plans.
Source
Chinese Journal of Medical Instrumentation (《中国医疗器械杂志》)
CAS
2012, No. 4, pp. 248-251 (4 pages)
Funding
Supported by the National Natural Science Foundation of China
Grant No. 60971044
Keywords
gene, feature selection, informatics, manifold learning, disease classification