Abstract: Many new and potential drug targets among G protein-coupled receptors (GPCRs) lack sufficient ligand associations, and accurately predicting and interpreting ligand bioactivities is vital for screening and optimizing hit compounds that target these GPCRs. To address the lack of labeled training samples efficiently, we propose multi-task regression learning with incoherent sparse and low-rank patterns (MTR-ISLR) to model ligand bioactivities and identify the key substructures associated with these GPCR targets. MTR-ISLR aims to improve the performance and interpretability of models trained on a small amount of available data by introducing homologous GPCR tasks. The low-rank constraint term encourages the model to capture the underlying relationship among homologous GPCR tasks, which improves generalization, and the entry-wise sparse regularization term identifies the essential discriminative substructures of each task, which makes the model explainable. We evaluated MTR-ISLR on 31 important human GPCR datasets from 9 subfamilies, each with fewer than 400 ligand associations. The results show that, in most cases, MTR-ISLR outperforms traditional single-task learning, deep multi-task learning, and multi-task learning models based on joint feature learning, with an average improvement of 7% in correlation coefficient (r²) and 12% in root mean square error (RMSE) over the runner-up predictors. All source code and data are freely available for academic use through the MTR-ISLR web server.^(1)
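The abstract names the two regularizers but does not state the objective, so the following is only a minimal sketch of one common way to combine them, not the authors' implementation. It assumes that each task's weight vector is a column of W = S + L, that the entry-wise sparse part S is penalized with an L1 norm and the low-rank part L with a nuclear norm, and that the problem is solved by proximal gradient descent. The function names, the penalty weights `gamma` and `tau`, and the synthetic data are all hypothetical.

```python
import numpy as np


def soft_threshold(A, thresh):
    """Proximal operator of the entry-wise L1 norm (promotes sparsity)."""
    return np.sign(A) * np.maximum(np.abs(A) - thresh, 0.0)


def singular_value_threshold(A, thresh):
    """Proximal operator of the nuclear norm (promotes low rank)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt


def fit_sparse_low_rank(Xs, ys, gamma=0.05, tau=0.5, n_iter=500):
    """Assumed objective (not taken from the paper):
        sum_t 1/(2 n_t) * ||X_t (s_t + l_t) - y_t||^2
        + gamma * ||S||_1 + tau * ||L||_*
    Xs: list of (n_t, d) feature matrices, one per task.
    ys: list of (n_t,) bioactivity vectors, one per task.
    Returns S (sparse part) and L (low-rank part), both of shape (d, T).
    """
    d, T = Xs[0].shape[1], len(Xs)
    S = np.zeros((d, T))
    L = np.zeros((d, T))
    # The smooth loss depends on S + L, so its Lipschitz constant in the
    # joint variable (S, L) is twice the per-task constant.
    lip = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / (2.0 * lip)
    for _ in range(n_iter):
        W = S + L
        # Column t holds the gradient of task t's squared loss.
        G = np.column_stack([
            X.T @ (X @ W[:, t] - y) / X.shape[0]
            for t, (X, y) in enumerate(zip(Xs, ys))
        ])
        S = soft_threshold(S - step * G, step * gamma)
        L = singular_value_threshold(L - step * G, step * tau)
    return S, L


if __name__ == "__main__":
    # Synthetic stand-in for three related tasks with few samples each.
    rng = np.random.default_rng(0)
    d, T, n = 20, 3, 30
    shared = np.outer(rng.normal(size=d), rng.normal(size=T))  # rank-1 part
    Xs = [rng.normal(size=(n, d)) for _ in range(T)]
    ys = [X @ shared[:, t] + 0.1 * rng.normal(size=n) for t, X in enumerate(Xs)]
    S, L = fit_sparse_low_rank(Xs, ys)
    print("nonzeros in S:", int(np.count_nonzero(S)))
    print("rank of L:", int(np.linalg.matrix_rank(L)))
```

In this sketch the nonzero rows of S would mark task-specific features (for example, substructure fingerprint bits), while L carries the structure shared across related tasks. Both penalty weights would need tuning, for instance by cross-validation.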
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 61872198, 61971216, 81771478, 81973512), the Basic Research Program of Science and Technology Department of Jiangsu Province (BK20201378), the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (18KJB416005), and the Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY218092).