In this paper, we consider regularized learning schemes based on the l1-regularizer and the pinball loss in a data-dependent hypothesis space. The goal is an error analysis for quantile regression learning. No regularity condition is imposed on the kernel function beyond continuity and boundedness. The graph-based semi-supervised algorithm introduces an extra error term, called the manifold error. Several new error bounds and convergence rates are derived explicitly, using techniques based on the l1-empirical covering number and a boundedness decomposition.
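As a minimal sketch of the two ingredients named above, the following uses the standard textbook definition of the pinball loss at quantile level tau and adds an l1 penalty on the expansion coefficients. The function names, the regularization weight `lam`, and the flat coefficient list are illustrative assumptions; the abstract does not give the paper's exact scheme, and the manifold term of the semi-supervised variant is not modeled here.

```python
def pinball_loss(y, prediction, tau):
    """Standard pinball (check) loss at quantile level tau, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    residual = y - prediction
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def empirical_pinball_risk(ys, predictions, tau):
    """Average pinball loss over a sample."""
    losses = [pinball_loss(y, p, tau) for y, p in zip(ys, predictions)]
    return sum(losses) / len(losses)


def l1_regularized_objective(ys, predictions, coefficients, tau, lam):
    """Empirical pinball risk plus lam times the l1 norm of the coefficients.

    `coefficients` stands for the weights of a kernel expansion over the
    sample points (an assumption made for illustration only).
    """
    penalty = lam * sum(abs(c) for c in coefficients)
    return empirical_pinball_risk(ys, predictions, tau) + penalty
```

With tau = 0.5 the loss reduces to half the absolute error, so the objective targets the conditional median; other values of tau target the corresponding conditional quantile.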
Protein-protein interactions (PPIs) are of great importance for understanding genetic mechanisms, delineating disease pathogenesis, and guiding drug design. With the growth of PPI data and the development of machine learning technologies, the prediction and identification of PPIs have become a research hotspot in proteomics. In this study, we propose a new prediction pipeline for PPIs based on gradient tree boosting (GTB). First, the initial feature vector is extracted by fusing pseudo amino acid composition (PseAAC), pseudo position-specific scoring matrix (PsePSSM), reduced sequence and index-vectors (RSIV), and autocorrelation descriptor (AD). Second, to remove redundancy and noise, we employ L1-regularized logistic regression (L1-RLR) to select an optimal feature subset. Finally, the GTB-PPI model is constructed. Five-fold cross-validation showed that GTB-PPI achieved accuracies of 95.15% and 90.47% on the Saccharomyces cerevisiae and Helicobacter pylori datasets, respectively. In addition, GTB-PPI could be applied to predict the independent test datasets for Caenorhabditis elegans, Escherichia coli, Homo sapiens, and Mus musculus, the one-core PPI network for CD9, and the crossover PPI network for the Wnt-related signaling pathways. The results show that GTB-PPI can significantly improve the accuracy of PPI prediction. The code and datasets of GTB-PPI can be downloaded from https://github.com/QUST-AIBBDRC/GTB-PPI/.
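The three-stage structure described above (feature vector, L1-regularized logistic regression for feature selection, gradient tree boosting, evaluated by five-fold cross-validation) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation: the synthetic data from `make_classification` stands in for the fused PseAAC/PsePSSM/RSIV/AD features, and the helper names and all hyperparameters (`C`, `n_estimators`, sample and feature counts) are hypothetical. The authors' own code is at the repository linked in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline():
    """Scale, select features with L1-regularized logistic regression, then boost."""
    # The L1 penalty drives some coefficients to zero; SelectFromModel keeps
    # only the features whose coefficients remain non-zero.
    selector = SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    )
    booster = GradientBoostingClassifier(n_estimators=50, random_state=0)
    return make_pipeline(StandardScaler(), selector, booster)


def five_fold_accuracy():
    """Return the five per-fold accuracies on synthetic stand-in data."""
    # Synthetic binary data; NOT the PPI datasets from the study.
    X, y = make_classification(
        n_samples=300, n_features=40, n_informative=8, random_state=0
    )
    return cross_val_score(build_pipeline(), X, y, cv=5, scoring="accuracy")


if __name__ == "__main__":
    scores = five_fold_accuracy()
    print("per-fold accuracy:", scores)
    print("mean accuracy:", scores.mean())
```

Placing the selector inside the pipeline means it is refit on the training portion of each fold, so the held-out fold does not influence which features are kept.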
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61863010), the Key Research and Development Program of Shandong Province of China (Grant No. 2019GGX101001), and the Natural Science Foundation of Shandong Province of China (Grant No. ZR2018MC007).