Abstract: Background: Clustering is a useful exploratory technique for interpreting gene expression data, revealing groups of genes that share common functional attributes. Biologists frequently face the problem of choosing an appropriate algorithm. We aimed to provide a standalone, easily accessible and biologically oriented criterion for evaluating the clustering of expression data. Methods: An external criterion that uses annotation-based similarities between genes is proposed. Gene Ontology information is employed as the annotation source. Six widely used clustering algorithms were compared on various types of gene expression data sets using the proposed criterion. Results: The ranking of these algorithms given by the criterion agrees with common knowledge. Single-linkage performs significantly worse than the other methods, even worse than the random algorithm. Ward's method achieves the best performance in most cases. Conclusions: The proposed criterion has a strong ability to distinguish among clustering algorithms with different distance measures. Analyzing the main contributors to the criterion may also offer some guidance in finding locally compact clusters. In addition, we suggest using Ward's algorithm for gene expression data analysis.
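The abstract does not define the criterion itself, so the following is only a minimal sketch of the general idea of an annotation-based external score, not the authors' method. It assumes Jaccard overlap between annotation term sets as the gene-to-gene similarity and averages it over gene pairs that share a cluster. The function names, gene names and term IDs are all hypothetical placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of annotation terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotation_score(labels, annotations):
    """Mean annotation similarity over gene pairs placed in the same cluster.

    labels:      dict mapping gene -> cluster id
    annotations: dict mapping gene -> set of annotation terms
    Assumption: a higher score means clusters are more functionally coherent.
    """
    sims = [
        jaccard(annotations[g], annotations[h])
        for g, h in combinations(sorted(labels), 2)
        if labels[g] == labels[h]
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Toy data: term IDs are placeholders, not real Gene Ontology terms.
annotations = {
    "g1": {"T1", "T2"},
    "g2": {"T1", "T2"},
    "g3": {"T3"},
    "g4": {"T3", "T4"},
}
coherent = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}  # groups similar genes
mixed = {"g1": 0, "g3": 0, "g2": 1, "g4": 1}     # groups dissimilar genes

print(annotation_score(coherent, annotations))  # 0.75
print(annotation_score(mixed, annotations))     # 0.0
```

Under these assumptions, the labelings produced by different algorithms (for example single-linkage and Ward's method) could be scored on the same annotation table and ranked by the result.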
Funding: National Natural Science Foundation of China (grant numbers: 81471746, 81071225); Innovation Project of Medicine and Health Science and Technology of Chinese Academy of Medical Sciences (grant number: 2016-12M-3-08).
Abstract: Impairment of the autonomic nervous system (ANS) is closely related to morbidity and mortality in congestive heart failure (CHF). This study aimed to investigate whether CHF can be characterized by the pattern of diurnal rhythm based on heart rate variability (HRV). Two CHF datasets (n=44) were obtained from PhysioNet, and the dataset of normal subjects (n=189) was obtained from THEW. Two 2 h episodes representing day and night in the resting state were selected from each Holter record. Time-domain measures, AR model-based measures, symbolic dynamics measures, and non-Gaussian indexes (λ) were calculated for each episode. The diurnal rhythm was represented by the ratio of an index in the day to that at night. The results demonstrated different patterns of diurnal rhythm among normal subjects, mild CHF (NYHA I-II) and severe CHF (NYHA III-IV), reflecting the change in sympathetic and vagal interaction from reciprocal function to accentuated antagonism due to CHF. Furthermore, using RRI_n, (LFnu)_d/(LFnu)_n and λ_d/λ_n, the sensitivity and specificity for discriminating normal subjects from CHF patients reached 95.45% and 95.24%, and those for discriminating between mild and severe CHF were 84.38% and 91.67%. The proposed method is promising for assessing the ANS state and monitoring therapeutic effects in CHF patients.
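As a small worked illustration of the two quantities named above, the sketch below computes a day-to-night ratio and the sensitivity and specificity from confusion-matrix counts. The abstract does not report the confusion matrix, so the counts are illustrative values chosen only because they reproduce the reported percentages for n=44 CHF and n=189 normal records. Treating CHF as the positive class is likewise an assumption, and the function names and the example index values are hypothetical.

```python
def diurnal_ratio(day_value, night_value):
    """Ratio of an HRV index measured in the day episode to the night episode."""
    return day_value / night_value


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of positives correctly detected
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    return sensitivity, specificity


# Hypothetical index values, for illustration only.
ratio = diurnal_ratio(day_value=60.0, night_value=40.0)
print(ratio)  # 1.5

# Illustrative counts (assumed, not reported): 44 CHF and 189 normal records.
sens, spec = sensitivity_specificity(tp=42, fn=2, tn=180, fp=9)
print(f"{sens:.2%} {spec:.2%}")  # 95.45% 95.24%
```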