Identifying pathogenetic variants and inferring their impact on protein-protein interactions sheds light on their functional consequences on diseases. Limited by the availability of experimental data on the consequences of protein interaction, most existing methods focus on building models to predict changes in protein binding affinity. Here, we introduced MIPPI, an end-to-end, interpretable transformer-based deep learning model that learns features directly from sequences by leveraging the interaction data from IMEx. MIPPI was specifically trained to determine the types of variant impact (increasing, decreasing, disrupting, and no effect) on protein-protein interactions. We demonstrate the accuracy of MIPPI and provide interpretation through the analysis of learned attention weights, which exhibit correlations with the amino acids interacting with the variant. Moreover, we showed the practicality of MIPPI in prioritizing de novo mutations associated with complex neurodevelopmental disorders and the potential to determine the pathogenic and driving mutations. Finally, we experimentally validated the functional impact of several variants identified in patients with such disorders. Overall, MIPPI emerges as a versatile, robust, and interpretable model, capable of effectively predicting mutation impacts on protein-protein interactions and facilitating the discovery of clinically actionable variants.
De novo variants (DNVs) are one of the most significant contributors to severe early-onset genetic disorders such as autism spectrum disorder, intellectual disability, and other developmental and neuropsychiatric (DNP) disorders. Presently, a plethora of DNVs have been identified using next-generation sequencing, and many efforts have been made to understand their impact at the gene level. However, there has been little exploration of the effects at the isoform level. The brain contains a high level of alternative splicing and regulation, and exhibits a more divergent splicing program than other tissues. Therefore, it is crucial to explore variants at the transcriptional regulation level to better interpret the mechanisms underlying DNP disorders. To facilitate better usage and improve the isoform-level interpretation of variants, we developed the NeuroPsychiatric Mutation Knowledge Base (PsyMuKB). It contains a comprehensive, carefully curated list of DNVs with transcriptional and translational annotations to enable identification of isoform-specific mutations. PsyMuKB allows a flexible search of genes or variants and provides both table-based descriptions and associated visualizations, such as expression, transcript genomic structures, protein interactions, and the mutation sites mapped on the protein structures. It also provides an easy-to-use web interface, allowing users to rapidly visualize the locations and characteristics of mutations and the expression patterns of the impacted genes and isoforms. PsyMuKB thus constitutes a valuable resource for identifying tissue-specific DNVs for further functional studies of related disorders. PsyMuKB is freely accessible at http://psymukb.net.
Funding (MIPPI study): Supported by grants from STI 2030-Major Projects (no. 2022ZD0209100), the National Natural Science Foundation of China (nos. 81971292 and 82150610506), the Natural Science Foundation of Shanghai (no. 21ZR1428600), the Medical-Engineering Cross Foundation of Shanghai Jiao Tong University (nos. YG2022ZD026 and YG2023ZD27), SJTU Trans-med Awards Research (no. 20220103), and the Paul K. and Diane Shumaker Endowment Fund at University of Missouri.
Funding (PsyMuKB study): Supported by grants from the National Key R&D Program of China (Grant No. 2017YFC0909200), the National Natural Science Foundation of China (Grant Nos. 81671328 and 61802057), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (Grant No. 1610000043), the Innovation Research Plan supported by Shanghai Municipal Education Commission (Grant No. ZXWF082101), the Science and Technology Development Plan of Jilin Province (Grant Nos. 20180414006GH and 20180520028JH), and the Fundamental Research Funds for the Central Universities.