Tumor diagnosis by analyzing gene expression profiles has become an active topic in bioinformatics, and the main problem is to identify the genes related to a tumor. This paper proposes a rank sum method to identify the related genes based on rank sum test theory in statistics. The tumor diagnosis system is constructed using a support vector machine (SVM) trained on the expression profiles of the related genes. Experiments demonstrate that the tumor diagnosis system built with the rank sum method and SVM reaches an accuracy of 96.2% on the colon data and 100% on the leukemia data.
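The gene-selection step described above can be illustrated with a small sketch. The abstract does not give the exact statistic, so this assumes the standard two-sample Wilcoxon rank-sum statistic with a normal approximation and no tie correction in the variance; the function names (`average_ranks`, `rank_sum_z`, `select_genes`) and the toy data layout (one row per gene, labels 0/1 for the two sample classes) are hypothetical. The selected genes would then be passed to a classifier such as an SVM, which is not shown here.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(expr: Sequence[float], labels: Sequence[int]) -> float:
    """z-score of the rank sum of class-1 samples (assumed normal approximation)."""
    ranks = average_ranks(expr)
    n1 = sum(1 for y in labels if y == 1)
    n2 = len(labels) - n1
    w = sum(r for r, y in zip(ranks, labels) if y == 1)
    mean = n1 * (n1 + n2 + 1) / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12  # no tie correction in this sketch
    return (w - mean) / var ** 0.5


def select_genes(matrix: Sequence[Sequence[float]],
                 labels: Sequence[int], top_k: int) -> List[int]:
    """Indices of the top_k genes ranked by |z|, largest first."""
    scores = [abs(rank_sum_z(row, labels)) for row in matrix]
    return sorted(range(len(matrix)), key=lambda g: scores[g], reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy data: 3 genes x 6 samples, first three samples are class 0.
    toy = [[1, 2, 3, 10, 11, 12],
           [5, 1, 6, 2, 7, 3],
           [9, 8, 7, 1, 2, 3]]
    print(select_genes(toy, [0, 0, 0, 1, 1, 1], top_k=2))
```

Genes whose expression separates the two classes cleanly get a large absolute z-score in either direction, so both over- and under-expressed genes are retained.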
The rank-sum test is a nonparametric method used in variety evaluation. However, hypothesis testing for the method has not been established for multi-trait comprehensive ranking. In this paper, under the null hypothesis H0 that a variety's ranking on each trait is random, the theoretical distribution of the sum of ranks (SR) was first derived and then used to obtain critical values for multi-trait comprehensive evaluation in rank-sum testing. A new C++ class and its basic arithmetic were defined to deal with the miscount caused by the limited precision of built-in data types in common statistical software when the numbers of varieties and traits are large. Finally, an application of the theoretical results was demonstrated using five starch viscosity traits of 12 glutinous maize varieties. The proposed method is simple and convenient, so it can easily be used to rank varieties by multiple traits.
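The null distribution of SR can be sketched by direct counting. This is an illustration under stated assumptions, not the paper's derivation: it assumes that under H0 a variety's rank on each trait is independently and uniformly distributed over 1..n with no ties, and the function names are hypothetical. Python integers have arbitrary precision, so the counts stay exact without a custom big-number class of the kind the paper defines in C++.

```python
from fractions import Fraction
from typing import Dict, Optional


def sum_of_ranks_counts(n_varieties: int, n_traits: int) -> Dict[int, int]:
    """Number of rank vectors giving each SR value, by repeated convolution."""
    counts = {0: 1}
    for _ in range(n_traits):
        nxt: Dict[int, int] = {}
        for s, c in counts.items():
            for r in range(1, n_varieties + 1):  # each rank 1..n equally likely
                nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt
    return counts


def lower_tail_prob(n_varieties: int, n_traits: int, sr: int) -> Fraction:
    """Exact P(SR <= sr) under the assumed null model."""
    counts = sum_of_ranks_counts(n_varieties, n_traits)
    total = n_varieties ** n_traits
    return Fraction(sum(c for s, c in counts.items() if s <= sr), total)


def lower_critical_value(n_varieties: int, n_traits: int,
                         alpha: Fraction) -> Optional[int]:
    """Largest SR with P(SR <= value) <= alpha, or None if no such value."""
    counts = sum_of_ranks_counts(n_varieties, n_traits)
    total = n_varieties ** n_traits
    cumulative, critical = 0, None
    for s in sorted(counts):
        cumulative += counts[s]
        if Fraction(cumulative, total) <= alpha:
            critical = s
        else:
            break
    return critical


if __name__ == "__main__":
    # Dimensions from the abstract's example: 12 varieties, 5 traits.
    print(lower_tail_prob(12, 5, 6))
    print(lower_critical_value(12, 5, Fraction(1, 20)))
```

Because the distribution is symmetric about its mean, an upper critical value follows from the lower one by reflection about n_traits * (n_varieties + 1) / 2.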
Funding: Supported by the National Key Basic Research Program of China (2006CB101700) and the Program for New Century Excellent Talents in University of the Ministry of Education of China (NCET2005-05-0502).