Abstract
Gene expression profile data are characteristically high-dimensional, small-sample, and nonlinear. To address these properties, this paper studies gene feature extraction and classification and proposes introducing Logistic regression and the t-test into the gene feature extraction process. Logistic regression performs a preliminary screening of the genes, a t-test performs a second screening to select the feature genes, and a classifier is then built on the extracted features, yielding the discriminant model that uses the fewest features while giving the best classification performance. The resulting classification model achieves good cancer classification results, is biologically interpretable, and provides an important basis for identifying disease-causing genes.
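The two-stage screening described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how Logistic regression ranks genes, so the sketch assumes a one-gene Logistic model per gene ranked by log-likelihood, a Welch t-statistic for the second screening, and a nearest-centroid rule as a stand-in classifier. The function names, the cut-offs `keep_stage1` and `keep_stage2`, and the synthetic data are all hypothetical.

```python
import math
import random
from statistics import mean, variance


def logistic_loglik(x, y, steps=100, lr=0.1):
    """Fit a one-gene logistic model by gradient ascent; return its log-likelihood."""
    mu, sd = mean(x), math.sqrt(variance(x)) or 1.0
    z = [(v - mu) / sd for v in x]  # standardise so a fixed step size is stable
    w = b = 0.0
    n = len(z)
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(w * zi + b))) for zi in z]
        w += lr * sum((yi - pi) * zi for yi, pi, zi in zip(y, p, z)) / n
        b += lr * sum(yi - pi for yi, pi in zip(y, p)) / n
    p = [1.0 / (1.0 + math.exp(-(w * zi + b))) for zi in z]
    eps = 1e-12  # guards against log(0)
    return sum(yi * math.log(pi + eps) + (1 - yi) * math.log(1 - pi + eps)
               for yi, pi in zip(y, p))


def welch_t(x, y):
    """Welch two-sample t-statistic between class 1 and class 0 for one gene."""
    a = [v for v, lab in zip(x, y) if lab == 1]
    b = [v for v, lab in zip(x, y) if lab == 0]
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_genes(X, y, keep_stage1=20, keep_stage2=5):
    """Stage 1: rank genes by one-gene logistic fit. Stage 2: rank survivors by |t|."""
    n_genes = len(X[0])
    cols = [[row[g] for row in X] for g in range(n_genes)]
    stage1 = sorted(range(n_genes),
                    key=lambda g: logistic_loglik(cols[g], y),
                    reverse=True)[:keep_stage1]
    return sorted(stage1, key=lambda g: abs(welch_t(cols[g], y)),
                  reverse=True)[:keep_stage2]


def nearest_centroid(X, y, genes):
    """Stand-in classifier (an assumption): assign a sample to the closer class centroid."""
    cents = {c: [mean(row[g] for row, lab in zip(X, y) if lab == c) for g in genes]
             for c in (0, 1)}

    def predict(row):
        return min(cents, key=lambda c: sum((row[g] - m) ** 2
                                            for g, m in zip(genes, cents[c])))
    return predict


def make_data(n_samples=40, n_genes=100, n_informative=5, shift=3.0, seed=0):
    """Synthetic data (hypothetical): only the first n_informative genes differ by class."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[rng.gauss(shift * lab if g < n_informative else 0.0, 1.0)
          for g in range(n_genes)] for lab in y]
    return X, y


X, y = make_data()
genes = select_genes(X, y)
predict = nearest_centroid(X, y, genes)
# Accuracy on the training samples only; a real study would use held-out data.
accuracy = mean(1.0 if predict(row) == lab else 0.0 for row, lab in zip(X, y))
print("selected genes:", sorted(genes), "training accuracy:", accuracy)
```

The sketch reports accuracy on the same samples used for selection, which overstates performance. With few samples and many genes, both the gene selection and the classifier would need to be evaluated inside cross-validation.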
Source
《桂林电子科技大学学报》
2012, Issue 1, pp. 69-71, 81 (4 pages in total)
Journal of Guilin University of Electronic Technology
Funding
Director's Fund of the Guangxi Key Laboratory of Information and Communication Technology (PF090109)