The Information Criterion Based Gene Selection Method and Its Application to Tumor Diagnosis

Original title: 基于信息准则的基因选取方法及其在肿瘤诊断中的应用

Cited by: 1

Abstract: Large-scale gene expression profiles provide more reliable and detailed biological data for tumor diagnosis, but selecting the genes related to a tumor is the key step in analyzing these data. This paper studies the selection of tumor-related genes from the viewpoint of Kullback-Leibler discrimination information. Based on the characteristics of the distributions of expression values for tumor-related and unrelated genes, we propose an information criterion based gene selection method. We then apply the method to tumor diagnosis, building the diagnosis model by training a support vector machine (SVM) on the expression profiles of the selected genes. Experiments show that the method is effective: the resulting diagnosis model reaches a diagnostic (prediction) accuracy of 94.4% on the colon cancer dataset and 100% on the leukemia dataset.
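The abstract does not state the exact form of the information criterion, so the following is only a minimal sketch of one plausible reading: each gene is scored by a symmetric Kullback-Leibler divergence between its expression-value distributions in the two sample classes, with the densities estimated by a Gaussian Parzen window (the keywords mention a Parzen window). The function names, the bandwidth `h`, the grid size, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import math

def parzen_density(samples, x, h):
    """Gaussian Parzen-window density estimate at x with bandwidth h."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

def symmetric_kl(a, b, h=0.5, grid_size=200, eps=1e-12):
    """Approximate KL(p||q) + KL(q||p) for the Parzen estimates of a and b."""
    lo = min(min(a), min(b)) - 3.0 * h
    hi = max(max(a), max(b)) + 3.0 * h
    step = (hi - lo) / grid_size
    total = 0.0
    for i in range(grid_size):
        x = lo + (i + 0.5) * step           # midpoint rule
        p = parzen_density(a, x, h) + eps   # eps avoids log(0)
        q = parzen_density(b, x, h) + eps
        total += (p - q) * math.log(p / q) * step
    return total

def rank_genes(expr_a, expr_b, top_k):
    """expr_a[g] and expr_b[g] hold the values of gene g in each class.

    Returns the indices of the top_k highest-scoring genes and all scores.
    """
    scores = [symmetric_kl(a, b) for a, b in zip(expr_a, expr_b)]
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order[:top_k], scores

# Toy data (hypothetical): gene 0 looks alike in both classes, gene 1 does not.
expr_a = [[1.0, 1.2, 0.9, 1.1], [5.0, 5.2, 4.9, 5.1]]
expr_b = [[1.1, 0.95, 1.05, 1.15], [1.0, 1.2, 0.8, 1.1]]
top, scores = rank_genes(expr_a, expr_b, top_k=1)
```

In a pipeline like the one the abstract describes, the expression profiles restricted to the selected genes would then be used to train an SVM classifier; that step is omitted here.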

Authors: Ge Fei (葛菲), Ma Jinwen (马尽文)

Affiliation: Department of Information Science, School of Mathematical Sciences, Peking University

Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2005, no. 3, pp. 312-315 (4 pages)

Funding: Supported by the National Natural Science Foundation of China, project 60071004

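Keywords: tumor diagnosis; selection method; information criterion; support vector machine (SVM); related genes; diagnosis model; gene expression profiles; biological data; method application; dataset; value distribution; profile data; accuracy; leukemia; colon cancer; Kullback-Leibler discrimination information; Parzen window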

Classification: TP319 [Automation and Computer Technology: Computer Software and Theory]; TS972.23 [Light Industry Technology and Engineering]
