Funding: Supported by the National Natural Science Foundation of China (61772020, 62202433, 62172371, 62272422, 62036010), the Natural Science Foundation of Henan Province (22100002), and the Postdoctoral Research Grant in Henan Province (202103111).
Abstract: Least squares projection twin support vector machine (LSPTSVM) computes faster than the classical least squares support vector machine (LSSVM). However, LSPTSVM is sensitive to outliers and its solution lacks sparsity, so it is difficult for LSPTSVM to process large-scale datasets with outliers. In this paper, we propose a robust LSPTSVM model (called R-LSPTSVM) by applying a truncated least squares loss function. The robustness of R-LSPTSVM is proved from a weighted perspective. Furthermore, we obtain a sparse solution of R-LSPTSVM by using the pivoting Cholesky factorization method in primal space. Finally, the sparse R-LSPTSVM algorithm (SR-LSPTSVM) is proposed. Experimental results show that SR-LSPTSVM is insensitive to outliers and can process large-scale datasets quickly.
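As a rough illustration of why truncating a squared loss limits the influence of outliers, the sketch below uses one common form of the truncated least squares loss, min(r^2, threshold^2). The abstract does not give the exact loss used in R-LSPTSVM, so this form, the function names, and the `threshold` parameter are assumptions made only for illustration. The second function shows the matching "weighted" reading: a sample whose residual exceeds the threshold contributes a constant to the loss, which is equivalent to giving it zero weight in a weighted least squares fit.

```python
def truncated_squared_loss(residual: float, threshold: float) -> float:
    """Squared loss capped at threshold**2 (assumed form, not taken from the paper)."""
    return min(residual * residual, threshold * threshold)


def implied_weight(residual: float, threshold: float) -> float:
    """Weight of a sample under the weighted reading of the capped loss.

    Inside the threshold the loss is the ordinary squared loss (weight 1);
    outside it the loss is constant, so the sample no longer pulls on the fit
    (weight 0).
    """
    return 1.0 if abs(residual) <= threshold else 0.0


if __name__ == "__main__":
    for r in (0.5, 1.5, 10.0):
        print(r, truncated_squared_loss(r, 2.0), implied_weight(r, 2.0))
```

With a threshold of 2.0, a residual of 10.0 contributes 4.0 rather than 100.0, which is the sense in which the capped loss is less sensitive to outliers than the plain squared loss.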
Funding: This work was supported partially by the National Natural Science Foundation of China (Grant Nos. 61872312, 61972335, 61472344, 61611540347, 61402396 and 61662021), partially by the Open Funds of State Key Laboratory for Novel Software Technology of Nanjing University (KFKT2020B15 and KFKT2020B16), partially by the Jiangsu "333" Project, partially by the Six Talent Peaks Project in Jiangsu Province (RJFW-053), partially by the Natural Science Foundation of Jiangsu (BK20181353), partially by the Yangzhou city-Yangzhou University Science and Technology Cooperation Fund Project (YZU201803), by the CERNET Innovation Project (NGII20180607), and partially by the Yangzhou University Top-level Talents Support Program (2019).
Abstract: Machine learning (ML) techniques and algorithms have been successfully and widely used in various areas, including software engineering tasks. As in other software projects, bugs are common in ML projects and libraries. To better understand the features related to bug fixing in ML projects, we conduct an empirical study of 939 bugs from five ML projects by manually examining the bug categories, fixing patterns, fixing scale, fixing duration, and types of maintenance. The results show that (1) there are commonly seven types of bugs in ML programs; (2) twelve fixing patterns are typically used to fix the bugs in ML programs; (3) 68.80% of the patches belong to micro-scale-fix and small-scale-fix; (4) 66.77% of the bugs in ML programs can be fixed within one month; (5) 45.90% of the bug fixes belong to corrective activity from the perspective of software maintenance. Moreover, we conduct a questionnaire survey of developers and users of ML projects to validate the results of our empirical study. The results of the empirical study are basically consistent with the feedback from developers. The findings provide useful guidance and insights for developers and users to effectively detect and fix bugs in ML projects.
Abstract: In this work, two chemometrics methods are applied to the modeling and prediction of electrophoretic mobilities of some organic and inorganic compounds. The successive projection algorithm (SPA), a feature selection strategy, is used as the descriptor selection and model development method. Then, support vector machine (SVM) and multiple linear regression (MLR) models are used to construct non-linear and linear quantitative structure-property relationship (QSPR) models, respectively. Comparing the results of the two models reveals that the SVM model has much better predictive value than the MLR one. The root-mean-square errors for the training set and the test set were 0.1911 and 0.2569, respectively, for the SVM model, versus 0.4908 and 0.6494 for the MLR model. The results show that the SVM model drastically enhances predictive ability in QSPR studies and is superior to the MLR model.
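For readers unfamiliar with the metric, the root-mean-square error quoted above can be computed as in the minimal sketch below. The function name `rmse` and the toy inputs are hypothetical; the abstract does not include the underlying mobility data, so no attempt is made to reproduce the reported values.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Toy values only, not data from the study.
    print(rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

Taking the reported figures at face value, the SVM errors are roughly 60% lower than the MLR errors on both the training set (0.1911 vs. 0.4908) and the test set (0.2569 vs. 0.6494).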