Funding: Partially supported by the National Young Science Foundation of China under No. 10901155 and the National Social Science Foundation of China under No. 10CTJ004.
Abstract: In biomedical research, investigators evaluating the effect of a drug often need to compare two treatment groups on multiple outcomes. Rank-sum tests can handle the case where the outcome differences between the two groups are in the same direction. When they are not, MAX can handle it, and it is very useful when one or a few of the differences are relatively larger than the others. For the case where the individual outcome differences between the two groups are moderate, this work proposes a new method: the sum of the absolute values of the rank-based test statistics for each outcome. Power comparisons with existing methods, based on simulation studies and a real example, show that the proposed test is robust and works well when the difference for each outcome is moderate. The authors also derive some theoretical results comparing the power of MAX and the proposed method.
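To make the two statistics contrasted in the abstract above concrete, here is a minimal Python sketch. It is not the authors' procedure: it assumes each outcome is summarized by a standardized Wilcoxon rank-sum statistic, that MAX means the largest absolute standardized statistic, and that p-values come from permuting group labels. The function names, the `n_perm` and `seed` parameters, and the permutation calibration are illustrative assumptions.

```python
import random
from math import sqrt


def midranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        i = j + 1
    return ranks


def standardized_rank_sums(rank_columns, labels):
    """Standardized rank sum of group 1 for each outcome (one per column)."""
    n, n1 = len(labels), sum(labels)
    center = (n + 1) / 2
    stats = []
    for ranks in rank_columns:
        w = sum(r for r, g in zip(ranks, labels) if g == 1)
        # Permutation variance of the rank sum; remains valid with ties.
        s2 = sum((r - center) ** 2 for r in ranks) / (n - 1)
        var = n1 * (n - n1) * s2 / n
        stats.append((w - n1 * center) / sqrt(var) if var > 0 else 0.0)
    return stats


def sum_abs_and_max_tests(data, labels, n_perm=2000, seed=0):
    """Permutation p-values for the sum-of-absolute-values and MAX statistics.

    data: list of rows (one per subject, one column per outcome).
    labels: 1 for the first group, 0 for the second.
    """
    rank_columns = [midranks(list(col)) for col in zip(*data)]
    z = standardized_rank_sums(rank_columns, labels)
    obs_sum, obs_max = sum(map(abs, z)), max(map(abs, z))
    rng = random.Random(seed)
    perm = list(labels)
    hits_sum = hits_max = 0
    eps = 1e-12  # tolerance for floating-point comparisons
    for _ in range(n_perm):
        rng.shuffle(perm)  # relabel subjects; ranks stay fixed
        zp = standardized_rank_sums(rank_columns, perm)
        hits_sum += sum(map(abs, zp)) >= obs_sum - eps
        hits_max += max(map(abs, zp)) >= obs_max - eps
    return {
        "sum_abs": obs_sum,
        "p_sum_abs": (hits_sum + 1) / (n_perm + 1),
        "max_abs": obs_max,
        "p_max": (hits_max + 1) / (n_perm + 1),
    }
```

Because absolute values are taken before combining, outcomes that differ in opposite directions do not cancel. Under these assumptions, the sum accumulates evidence from several moderate differences, while the maximum responds mainly to the single largest one.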
Abstract: In many research disciplines, hypothesis tests are applied to evaluate whether findings are statistically significant or could be explained by chance. The Wilcoxon-Mann-Whitney (WMW) test is among the most popular hypothesis tests in medicine and life science for analyzing whether two groups of samples are equally distributed. This nonparametric statistical homogeneity test is commonly applied in molecular diagnosis. Generally, solving the WMW test exactly requires a high combinatorial effort for large sample cohorts containing a significant number of ties. Hence, the P value is frequently approximated by a normal distribution. We developed EDISON-WMW, a new approach to calculate the exact permutation P value of the two-tailed unpaired WMW test without any corrections required and allowing for ties. The method relies on dynamic programming to solve the combinatorial problem of the WMW test efficiently. Beyond a straightforward implementation of the algorithm, we present different optimization strategies and developed a parallel solution. Using our program, the exact P value for large cohorts containing more than 1000 samples with ties can be calculated within minutes. We demonstrate the performance of this novel approach on randomly generated data, benchmark it against 13 other commonly applied approaches, and evaluate molecular biomarkers for lung carcinoma and chronic obstructive pulmonary disease (COPD). We found that approximated P values were generally higher than the exact solution provided by EDISON-WMW. Importantly, the algorithm can also be applied to high-throughput omics datasets, where hundreds or thousands of features are included. To provide easy access to the multi-threaded version of EDISON-WMW, a web-based solution of our algorithm is freely available at http://www.ccb.uni-saarland.de/software/wtest/.
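The idea of obtaining an exact two-tailed WMW P value with ties by dynamic programming can be illustrated with a generic subset-counting sketch. This is not the EDISON-WMW algorithm, whose details are not given in the abstract above. The sketch assumes the exact null distribution is the permutation distribution of the rank sum given the observed tie pattern, and it defines "two-tailed" as the probability of a rank sum at least as far from its null mean as the observed one. The function names are hypothetical.

```python
from math import comb


def doubled_midranks(values):
    """Return 2 * midrank for each value, so tied ranks stay integers."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Ranks i+1 .. j+1 average to (i + j + 2) / 2; doubling gives i + j + 2.
        for k in range(i, j + 1):
            out[order[k]] = i + j + 2
        i = j + 1
    return out


def exact_wmw_pvalue(x, y):
    """Exact two-sided permutation p-value of the rank-sum statistic."""
    n1, n = len(x), len(x) + len(y)
    r2 = doubled_midranks(list(x) + list(y))
    observed = sum(r2[:n1])
    total = sum(r2)
    # dp[k][s] = number of size-k subsets whose doubled ranks sum to s.
    dp = [[0] * (total + 1) for _ in range(n1 + 1)]
    dp[0][0] = 1
    for r in r2:
        # Go through k downward so each observation is used at most once.
        for k in range(n1, 0, -1):
            row, prev = dp[k], dp[k - 1]
            for s in range(r, total + 1):
                row[s] += prev[s - r]
    mean = n1 * (n + 1)  # null mean of the doubled rank sum
    deviation = abs(observed - mean)
    extreme = sum(c for s, c in enumerate(dp[n1]) if abs(s - mean) >= deviation)
    return extreme / comb(n, n1)
```

For example, `exact_wmw_pvalue([1, 2, 3], [4, 5, 6])` counts 2 of the 20 possible group assignments as at least as extreme as the observed one. The table grows with the group size times the total doubled rank sum, so this sketch is only practical for small samples; the optimizations and parallelization mentioned in the abstract are not reproduced here.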
Funding: Supported by the National Key Technologies R&D Program (No. 2004BA711A21) and the National Natural Science Foundation of China (Nos. 60275007 and 60234020).
Abstract: Protein phosphorylation plays an important role in various cellular processes. Because of its high complexity, the mechanism needs further study. In the last few years, many methods have been contributed to this field, but almost all of them investigate the mechanism based on the protein sequences around protein sites. In this study, we explore the mechanism by characterizing the microenvironment surrounding phosphorylated protein sites with a modified shell model, and we obtain some significant properties using the rank-sum test, such as the lack of some classes of residues, atoms, and secondary structures. Furthermore, we find that the depletion of some properties affects protein phosphorylation remarkably. Our results suggest that exploring the mechanism of protein phosphorylation from the microenvironment is a meaningful direction, and we expect further findings as the amount of phosphorylation and protein structure data increases.