Funding: The authors thank the anonymous referees for their useful comments, which greatly improved the quality of the paper. This work was supported in part by the National Basic Research Program (973) of China (2012CB316203), the Natural Science Foundation of China (Grant Nos. 61033007, 61272121, 61332014, 61572367, 61332006, 61472321, and 61502390), the National High Technology Research and Development Program (863) of China (2015AA015307), the Fundamental Research Funds for the Central Universities (3102015JSJ0011, 3102014JSJ0005, and 3102014JSJ0013), and the Graduate Starting Seed Fund of Northwestern Polytechnical University (Z2012128).
Abstract: The order-preserving submatrix (OPSM) has become an important model of biologically meaningful subspace clusters, capturing the general tendency of gene expression across a subset of conditions. With advances in microarray and analysis techniques, large volumes of gene expression datasets and OPSM mining results are being produced. OPSM queries can efficiently retrieve relevant OPSMs from these large collections. However, improving the relevance of OPSM query results remains difficult in real-life exploratory data analysis. First, it is hard to capture subjective aspects of interestingness, e.g., the analyst's expectations given her or his domain knowledge. Second, even when these expectations can be specified declaratively, it is still challenging to use them during the computation of OPSM queries. To the best of our knowledge, existing methods mainly focus on batch OPSM mining, while few works address OPSM queries. To solve these problems, the paper proposes two constrained OPSM query methods, which exploit user-defined constraints to search for relevant results using two kinds of indices introduced in the paper. Extensive experiments on real datasets demonstrate that queries based on the multi-dimension index (cIndex) and the enumerating sequence index (esIndex) perform better than brute-force search.
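As a rough illustration of the brute-force baseline mentioned in this abstract, the sketch below scans every row of an expression matrix and keeps those whose values strictly increase along a given column order, which is the usual defining property of an OPSM. It is only a sketch under stated assumptions: the function name `rows_supporting_order`, the strict-increase criterion, and the toy matrix are illustrative and are not taken from the paper, and it does not reproduce cIndex, esIndex, or the constrained query methods.

```python
from typing import List, Sequence


def rows_supporting_order(
    matrix: Sequence[Sequence[float]], column_order: Sequence[int]
) -> List[int]:
    """Return indices of rows whose values strictly increase along column_order.

    Hypothetical helper: a linear scan over all rows, i.e. a brute-force query.
    """
    supporting = []
    for i, row in enumerate(matrix):
        # Read the row's values in the queried column order.
        values = [row[c] for c in column_order]
        # The row supports the order if each value is smaller than the next.
        if all(a < b for a, b in zip(values, values[1:])):
            supporting.append(i)
    return supporting


# Toy expression matrix (made-up numbers): rows are genes, columns are conditions.
expression = [
    [0.2, 1.5, 0.9, 3.1],
    [1.0, 4.0, 2.5, 6.0],
    [5.0, 1.0, 2.0, 0.5],
    [0.1, 0.8, 0.3, 0.2],
]

# Which genes rise across conditions 0, then 2, then 1?
print(rows_supporting_order(expression, [0, 2, 1]))
```

Each query of this kind costs time proportional to the number of rows times the length of the column order, which is the sort of cost an index is meant to avoid.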
Funding: This work was supported by the National Key Research and Development Program of China (2016YFC1306603 to Q.Y.), the National Natural Science Foundation of China (NSFC; 31930048 and 31671060 to Q.Y., and 61972320 to B.C.), and the Projects of International Cooperation and Exchange under NSFC (81720108016 to Q.Y.).
Abstract: Background: A blood-based test for predicting disease progression and for early diagnosis of Parkinson’s disease (PD) is an unmet need in the clinic. MicroRNA (miRNA) profiles are regarded as potential diagnostic biomarkers for human diseases, but miRNAs in the periphery are susceptible to the influence of various components. MiRNAs enriched in serum extracellular vesicles (EVs) have demonstrated disease-specific advantages in diagnosis due to their high abundance, stability and resistance to degradation. This study aimed to screen differentially expressed EV-derived miRNAs between healthy controls and PD patients to aid in the diagnosis of PD. Methods: A total of 31 healthy controls and 72 patients diagnosed with PD at different Hoehn and Yahr stages at Tangdu Hospital were included. In total, 185 differentially expressed miRNAs were obtained through RNA sequencing of serum EVs together with edgeR and t-test analyses. Subsequently, weighted gene co-expression network analysis (WGCNA) was used to identify miRNAs commonly expressed in all stages of PD by constructing connections between modules, and miRNAs specifically expressed in each stage of PD by functional enrichment analysis. After aligning these miRNAs with PD-related miRNAs in the Human miRNA Disease Database, the screened miRNAs were further validated by receiver operating characteristic (ROC) curves and by quantitative real-time polymerase chain reaction (qRT-PCR) using peripheral blood EVs from 40 more participants. Results: WGCNA showed that 4 miRNAs were commonly associated with all stages of PD and 13 miRNAs were specifically associated with different stages of PD. Of the 17 obtained miRNAs, 7 were validated by ROC curve analysis and 7 were verified in 40 more participants by qRT-PCR. Six miRNAs were verified by both methods, including 2 miRNAs that were commonly expressed in all stages of PD and 4 miRNAs that were specifically expressed in different stages of PD. Conclusions: The 6 serum EV-derived miRNAs, hsa-miR-374a-5p, hsa-miR-374b-5p, hsa-miR-199a-3p, hsa-miR-28-5p, hsa-miR-22-5p and hsa-miR-151a-5p, may potentially be used as biomarkers for PD progression and for early diagnosis of PD in populations.
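The ROC-curve validation step mentioned in this abstract can be illustrated with a small sketch of how an area under the ROC curve (AUC) might be computed for a single candidate marker. This is an assumption-laden illustration: the function name `roc_auc` and the expression values are made up for the example and are not data or code from the study. It uses the standard equivalence between the AUC and the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Hypothetical helper: counts the fraction of (case, control) pairs in which
    the case has the higher score; ties contribute 0.5.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up relative expression levels of one marker, for illustration only.
patients = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]

print(f"AUC = {roc_auc(patients, controls):.3f}")
```

An AUC of 0.5 corresponds to a marker that does not separate the two groups, while values near 1.0 indicate good separation.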
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61602386 and 61332014), the Natural Science Foundation of Shaanxi Province (No. 2017JQ6008), and the top university visiting foundation for excellent youth scholars of Northwestern Polytechnical University.
Abstract: Background: One of the most important and challenging issues in biomedicine and genomics is how to identify disease-related genes. Datasets from high-throughput biotechnologies have been widely used to address this issue from various perspectives, e.g., epigenomics, genomics, transcriptomics, proteomics, and metabolomics. At the genomic level, copy number variations (CNVs) have been recognized as critical genetic variations that contribute significantly to genomic diversity. They have been associated with both common and complex diseases, and thus have a large influence on a variety of Mendelian and somatic genetic disorders. Results: In this review, based on a variety of complex diseases, we give an overview of the critical role of CNVs in identifying disease-related genes, and discuss in detail the different high-throughput and sequencing methods applied to CNV detection. Some limitations and challenges concerning CNVs are also highlighted. Conclusions: Reliable detection of CNVs will not only allow driver mutations to be discriminated for various diseases, but will also help to develop personalized medicine when integrated with other genomic features.