Funding: Supported by the Special Project on Precision Medicine under the National Key R&D Program of China (Grant Nos. 2017YFC0906600 and 2018YFC0910500); the National Natural Science Foundation of China (Grant Nos. 31671360, 31801095, and 31601067); the Fundamental Research Funds for the Central Universities (Grant Nos. 2019kfyRCPY043 and 2017KFXKJC001); the National Program for Support of Top-Notch Young Professionals; the Changjiang Scholars Program of China; the Program for HUST Academic Frontier Youth Team; and the China Postdoctoral Science Foundation (Grant No. 2018M632870).
Abstract: As an important protein acylation modification, lysine succinylation (Ksucc) is involved in diverse biological processes and participates in human tumorigenesis. Here, we collected 26,243 non-redundant known Ksucc sites from 13 species as the benchmark data set, combined 10 types of informative features, and implemented a hybrid-learning architecture by integrating deep-learning and conventional machine-learning algorithms into a single framework. We constructed a new tool named HybridSucc, which achieved area under the curve (AUC) values of 0.885 and 0.952 for general and human-specific prediction of Ksucc sites, respectively. In comparison, the accuracy of HybridSucc was 17.84%-50.62% better than that of other existing tools. Using HybridSucc, we conducted a proteome-wide prediction and prioritized 370 cancer mutations that change the Ksucc states of 218 important proteins, including PKM2, SHMT2, and IDH2. We not only developed a high-profile tool for predicting Ksucc sites, but also generated useful candidates for further experimental consideration. The online service of HybridSucc can be freely accessed for academic research at http://hybridsucc.biocuckoo.org/.
Funding: Supported by the Special Project on Precision Medicine under the National Key R&D Program (2017YFC0906600) and the Natural Science Foundation of China (No. 31671360).
Abstract: The integration, analysis and visualization of big omics data are critical for addressing a broad spectrum of biological questions. One of the most frequently conducted procedures is enrichment analysis, which statistically tests whether individual functional annotations of Gene Ontology (GO) or Kyoto Encyclopedia of Genes and Genomes (KEGG) are significantly over- or under-represented in an "interesting" gene or protein list against the reference set (Tavazoie et al., 1999).
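To make the statistical test described in this abstract concrete, here is a minimal sketch of one common way to score a single annotation term: a hypergeometric test. The abstract does not say which test the authors use, so the choice of test, the function name `hypergeom_enrichment`, and the toy counts below are all assumptions for illustration only.

```python
from math import comb


def hypergeom_enrichment(k, n, K, N):
    """Hypergeometric p-values for one annotation term (illustrative sketch).

    N: size of the reference set
    K: reference genes carrying the annotation
    n: size of the "interesting" gene list
    k: genes in the list carrying the annotation
    Returns (p_over, p_under): P(X >= k) and P(X <= k).
    """
    total = comb(N, n)

    def pmf(i):
        # Probability of drawing exactly i annotated genes in a list of size n
        return comb(K, i) * comb(N - K, n - i) / total

    lowest = max(0, n - (N - K))   # fewest annotated genes the list can hold
    highest = min(n, K)            # most annotated genes the list can hold
    p_over = sum(pmf(i) for i in range(k, highest + 1))
    p_under = sum(pmf(i) for i in range(lowest, k + 1))
    return p_over, p_under


# Hypothetical counts: 40 of 200 listed genes carry a term that
# annotates 500 of 20,000 reference genes.
p_over, p_under = hypergeom_enrichment(k=40, n=200, K=500, N=20000)
print(f"over-representation p = {p_over:.3g}, under-representation p = {p_under:.3g}")
```

In practice the test is repeated for every GO or KEGG term, so the resulting p-values would normally be corrected for multiple testing before any term is called significant.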