Funding: Project supported by the National Natural Science Foundation of China (No. 10572117), the National Astronautics Science Foundation of China (Nos. N3CH0502 and N5CH0001), and the Program for New Century Excellent Talent of Ministry of Education of China (No. NCET-05-0868).
Abstract: The support vector machine (SVM) is introduced to analyze the reliability of structures with implicit performance functions, which are difficult to handle with classical methods such as the first-order reliability method (FORM) and Monte Carlo simulation (MCS). As a classification method built on the structural risk minimization principle, SVM learns well from a small amount of information and generalizes well over the complete data. Two approaches, SVM-based FORM and SVM-based MCS, are therefore presented for the structural reliability analysis of implicit limit state functions. The conventional response surface method (RSM) and the artificial neural network (ANN) are widely used to replace the implicit limit state function and reduce the computational cost. Compared with them, SVM approximates the implicit function with higher precision and better generalization from a small amount of information, and it avoids the "curse of dimensionality". The SVM-based reliability approaches approximate the actual performance function over the complete sampling data with fewer evaluations of the implicit performance function (usually finite element analyses), and illustrative examples demonstrate that their computational precision satisfies engineering requirements.
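The Monte Carlo half of this idea can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: `limit_state` is a hypothetical explicit function standing in for the trained SVM surrogate, the two inputs are assumed to be independent standard normal variables, and the sample size and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist


def limit_state(x1, x2):
    """Hypothetical explicit stand-in for the surrogate performance function.

    g > 0 means safe and g <= 0 means failure. For independent standard
    normal inputs, the exact reliability index of this linear g is 3.
    """
    return 3.0 - (x1 + x2) / math.sqrt(2.0)


def mc_failure_probability(g, n_samples, seed=0):
    """Estimate P_f = P[g(X) <= 0] by crude Monte Carlo sampling."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        if g(x1, x2) <= 0.0:
            failures += 1
    return failures / n_samples


# Each call to g is cheap here; with a real structure, the surrogate replaces
# an expensive finite element run, which is what makes sampling affordable.
pf = mc_failure_probability(limit_state, 200_000)

# Generalized reliability index recovered from the estimated probability.
beta = -NormalDist().inv_cdf(pf)
print(f"P_f ~ {pf:.5f}, beta ~ {beta:.2f}")
```

Because only the sign of `g` matters to the failure count, a classifier that separates safe from failed samples is sufficient for this step, which is one reason a classification method fits the task.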
Funding: National Natural Science Foundation of China (Nos. 11262014, 11962021 and 51965051); Inner Mongolia Natural Science Foundation, China (No. 2019MS05064); Inner Mongolia Earthquake Administration Director Fund Project, China (No. 2019YB06); Inner Mongolia University of Technology Foundation, China (No. 2020015).
Abstract: For the reliability analysis of small-sample data or implicit structural functions, a novel structural reliability analysis model is proposed that combines the support vector machine (SVM) with the neural network direct integration method (DNN). First, SVM, which learns well from small samples, is trained on the small-sample data to fit the structural performance function and establish a regular integration region. Second, DNN approximates the integrand so that the multiple integral over the integration region can be evaluated. Finally, the structural reliability is obtained from the DNN. Numerical examples demonstrate the effectiveness of the method, which provides a feasible approach to structural reliability analysis.
Abstract: This paper presents a novel way of building quantitative structure-property relationship (QSPR) models for predicting the gas-to-benzene solvation enthalpy (ΔHSolv) of 158 organic compounds from molecular descriptors calculated from the structure alone. Different kinds of descriptors were calculated for each compound using the Dragon package. The enhanced replacement method (ERM) was employed as the variable selection technique to choose an optimal subset of descriptors. Our investigation reveals that the relationship between physico-chemical properties and solvation enthalpy is nonlinear and that the ERM method cannot model the solvation enthalpy accurately. The standard error of the prediction set is 1.681 kJ·mol^(-1) for the support vector machine (SVM), compared with 4.624 kJ·mol^(-1) for ERM. The ΔHSolv values calculated by SVM were in good agreement with the experimental ones, and the SVM models performed better than the ERM model. This indicates that SVM can be used as an alternative modeling tool for QSPR studies.
Abstract: The structure and function of proteins are closely related, and a protein's structure determines its function, so protein structure prediction is important. β-turns are important components of protein secondary structure, which makes an accurate method for predicting β-turn types necessary. In this paper, we used a composite vector of the position conservation scoring function, the increment of diversity and predicted secondary structure information as the input parameters of a support vector machine algorithm for predicting β-turn types in a database of 426 protein chains. The overall prediction accuracies were 95.6%, 97.8%, 97.0%, 98.9%, 99.2%, 91.8%, 99.4% and 83.9%, with Matthews correlation coefficient values of 0.74, 0.68, 0.20, 0.49, 0.23, 0.47, 0.49 and 0.53, for types I, II, VIII, I', II', IV, VI and nonturn, respectively. These results are better than those of other prediction methods.
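The gap between accuracy and Matthews correlation coefficient (MCC) in the figures above, for example 97.0% accuracy with an MCC of 0.20 for type VIII, is typical of rare classes. A small sketch of the two statistics shows why; the confusion-matrix counts below are hypothetical and chosen only for illustration, since the abstract does not report them.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def matthews_cc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a rare class: 25 positives among 1000 samples.
rare = dict(tp=5, fp=10, fn=20, tn=965)
print(f"accuracy = {accuracy(**rare):.3f}, MCC = {matthews_cc(**rare):.3f}")
```

With these counts most of the accuracy comes from the many true negatives, while the MCC stays low because only 5 of the 25 positives are recovered.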
Abstract: Building on previous research on predicting β-hairpin motifs in proteins, we apply the Random Forest and Support Vector Machine algorithms to predict β-hairpin motifs in the ArchDB40 dataset. Motifs with loop lengths of 2 to 8 amino acid residues are extracted as the research objects, and a fixed-length pattern of 12 amino acids is selected. With the same characteristic parameters and the same test method, the Random Forest algorithm is more effective than the Support Vector Machine. In addition, because the Random Forest algorithm does not overfit when the dimension of the characteristic parameters is higher, we use Random Forest with higher-dimensional characteristic parameters to predict β-hairpin motifs. Better prediction results are obtained: the overall accuracy and Matthews correlation coefficient in 5-fold cross-validation reach 83.3% and 0.59, respectively.
Funding: Supported by the Open Project Program of Key Laboratory of Environmentally Friendly Chemistry and Applications of Ministry of Education, China (No. 10HJYH06).
Abstract: A three-descriptor quantitative structure-property relationship (QSPR) model, based on the support vector machine (SVM) algorithm, was constructed to predict the glass transition temperatures (Tgs) of polyarylates with complex structures. A total of 50 polyarylates were randomly divided into three sets: the training set (30 polymers), the validation set (10 polymers) and the prediction set (10 polymers). After the parameters were adjusted by trial and error, the final optimum SVM model, based on Austin Model 1 (AM1) calculations, uses a polynomial kernel with C = 100, ε = 1.00E-05 and d = 2. The root-mean-square (RMS) errors obtained for the training set, validation set and prediction set are 19.4, 12.8 and 15.5 K, respectively. The results show that the proposed SVM model has better statistical quality than previous models. Thus, applying the SVM algorithm to predict the Tgs of polymers is feasible.
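To make the roles of C, ε and d concrete, the sketch below writes out the quantities they control in support vector regression. It is an illustration under assumptions rather than the model from the abstract: the kernel form `(x . y + coef0) ** degree` with `coef0 = 1` is a common convention that the abstract does not specify, and all function names are hypothetical.

```python
import math

C = 100.0        # penalty weight on errors outside the tube (reported value)
EPSILON = 1e-5   # half-width of the insensitive tube (reported value)
DEGREE = 2       # polynomial degree d (reported value)


def poly_kernel(x, y, degree=DEGREE, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree; coef0 = 1 is an assumption."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def eps_insensitive_loss(y_true, y_pred, epsilon=EPSILON):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(weight_norm_sq, losses, c=C):
    """Standard SVR trade-off: flatness term plus C times the summed losses."""
    return 0.5 * weight_norm_sq + c * sum(losses)


def rms_error(y_true, y_pred):
    """Root-mean-square error, the statistic quoted for each data set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical observed and predicted Tg values in kelvin.
tg_obs = [450.0, 480.0, 510.0]
tg_pred = [460.0, 470.0, 525.0]
print(f"RMS error = {rms_error(tg_obs, tg_pred):.1f} K")
```

A tube as narrow as 1.00E-05 means that nearly every training residual is penalized, so the fit is governed mainly by C and by the kernel.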
Funding: Provided by the Scientific and Technological Research Council of Türkiye (TÜBİTAK).
Abstract: Machine learning algorithms operating in an unsupervised fashion have emerged as promising tools for automated detection of structural damage. The approach relies on selecting appropriate features, training a model on a reference data set collected from the healthy structure, and using the trained model to identify outlier conditions that represent the damaged state. In this paper, the coefficients and residuals of an autoregressive model with exogenous input, created using only the measured output signals, are extracted as damage features. The features obtained at the baseline state for each sensor cluster are then used to train a one-class support vector machine, an unsupervised classifier that generates a decision function using only patterns belonging to this baseline state. Once structural damage is detected by the trained machine, a damage index based on a comparison of the residuals between the trained class and the outlier state is used to localize the damage. The two-step damage assessment framework is first implemented on an eight degree-of-freedom numerical model that includes the effects of measurement noise. Subsequently, vibration data collected from a one-story, one-bay reinforced concrete frame subjected to progressive levels of damage are used to verify the accuracy and robustness of the proposed methodology.
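The residual comparison behind such a damage index can be illustrated with a much simpler model than the one in the abstract. The sketch below is a deliberately reduced stand-in: it uses a first-order autoregressive model without exogenous input, synthetic signals, and a plain ratio of residual standard deviations in place of the one-class support vector machine. Every name and number in it is an assumption made for illustration.

```python
import math
import random


def simulate_ar1(phi, n, seed):
    """Generate a synthetic AR(1) series x[t] = phi * x[t-1] + w[t], w ~ N(0, 1)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_ar1(x):
    """Least-squares estimate of the AR(1) coefficient."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den


def residual_std(x, phi):
    """Standard deviation of the one-step prediction residuals under phi."""
    res = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
    mean = sum(res) / len(res)
    return math.sqrt(sum((r - mean) ** 2 for r in res) / len(res))


def damage_index(x_new, phi_ref, sigma_ref):
    """Residual spread of a new signal relative to the baseline spread."""
    return residual_std(x_new, phi_ref) / sigma_ref


# Train on the baseline (healthy) state only.
baseline = simulate_ar1(0.9, 20_000, seed=1)
phi_ref = fit_ar1(baseline)
sigma_ref = residual_std(baseline, phi_ref)

# A changed AR coefficient plays the role of altered structural dynamics.
healthy = simulate_ar1(0.9, 20_000, seed=2)
damaged = simulate_ar1(0.3, 20_000, seed=3)
di_healthy = damage_index(healthy, phi_ref, sigma_ref)
di_damaged = damage_index(damaged, phi_ref, sigma_ref)
print(f"healthy index = {di_healthy:.3f}, damaged index = {di_damaged:.3f}")
```

An index near 1 indicates that the baseline model still predicts the signal well, while a larger value indicates that the dynamics have changed. Evaluating such an index per sensor cluster is what would allow localization in the two-step framework.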
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2017-YFD0500303); the National Natural Science Foundation of China (Grant Nos. 31371106, 91640105); the China Agriculture Research System (No. CARS-36); the Huazhong Agricultural University Scientific and Technological Self-innovation Foundation (Program No. 52204-13002).
Abstract: Traditionally, optical microscopy is used to visualize the morphological features of pathogenic bacteria, and these features are then used to detect and identify the bacteria. However, the effectiveness of this microscopy-based method is limited by the resolution of conventional optical microscopy and by the lack of a standard pattern library for bacteria identification. Here, we report a pilot study on the combined use of Structured Illumination Microscopy (SIM) and machine learning for rapid bacteria identification. After applying machine learning to SIM image datasets from three model bacteria (Escherichia coli, Mycobacterium smegmatis and Pseudomonas aeruginosa), we obtained a classification accuracy of up to 98%. This study points to a promising possibility for rapid bacterial identification by morphological features.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 20373040, 20503015).
Abstract: Support vector classification (SVC) was employed to build a model for classifying the antifungal activities of 1-(1H-1,2,4-triazole-1-yl)-2-(2,4-difluorophenyl)-3-substituted-2-propanols triazole derivatives. Compounds with high antifungal activities and those with low antifungal activities were compared on the basis of the following molecular descriptors: the net atomic charge on the N atom connected to R, the dipole moment and the heat of formation. Using SVC, a mathematical model was constructed that predicts the antifungal activities of the triazole derivatives with an accuracy of 91% in the leave-one-out cross-validation (LOOCV) test. The results indicate that the SVC model can outperform the principal component analysis (PCA) and K-Nearest Neighbor (KNN) models on this real-world data set.
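Leave-one-out cross-validation itself is independent of the classifier, so the procedure can be sketched with any model. The example below is illustrative only: a 1-nearest-neighbor rule stands in for the SVC, and the one-descriptor samples with activity labels 0 (low) and 1 (high) are invented for the purpose.

```python
def nearest_neighbor_predict(train, query):
    """Return the label of the training sample closest to `query`."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    _, label = min(train, key=lambda item: sq_dist(item[0], query))
    return label


def loocv_accuracy(samples, predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if predict(train, features) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical (descriptor vector, activity class) pairs.
toy = [
    ((0.0,), 0), ((0.2,), 0), ((0.4,), 0),
    ((0.5,), 1), ((1.0,), 1), ((1.2,), 1), ((1.4,), 1),
]
print(f"LOOCV accuracy = {loocv_accuracy(toy, nearest_neighbor_predict):.3f}")
```

Each sample is predicted by a model that never saw it, which suits data sets too small to spare a separate test set. The cost is one training run per sample.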
Funding: Supported by the National Natural Science Foundation of China (30660044).
Abstract: Based on the concept of pseudo amino acid composition (PseAAC), protein structural classes are predicted by an approach that combines the increment of diversity with the support vector machine (ID-SVM), in which the dipeptide amino acid composition of proteins is used as the source of diversity. The jackknife test shows that the total prediction accuracy is 96.6%, higher than that given by other approaches. The specificity (Sp) and the Matthews correlation coefficient (MCC) are also calculated for each protein structural class: the Sp is more than 88% and the MCC is higher than 92%. These high MCC and Sp values imply that the ID-SVM model is credible for predicting protein structural classes. The results indicate that (1) the choice of the source of diversity is reasonable, (2) the predictive performance of ID-SVM is excellent, and (3) the amino acid sequences of proteins contain information about protein structural classes.
Abstract: Natural gas load forecasting is key to the efficient operation of a pipeline network. An accurate forecast is required to guarantee balanced network operation and ensure a safe gas supply at minimum cost. Machine learning techniques have been increasingly applied to load forecasting. In this paper, the support vector machine (SVM), a novel regression technique based on statistical learning theory, is investigated for short-term natural gas load forecasting. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization used in conventional regression techniques. Using a data set containing 2 years of load values, we developed an SVM prediction model to obtain load predictions for 31 days. The results for short-term city natural gas load forecasting show that SVM provides better prediction accuracy than a neural network. A software package, natural gas pipeline networks simulation and load forecasting (NGPNSLF), based on support vector regression prediction has been developed and applied in practice.
Funding: National Science Foundation and Technology Innovation Fund of P.R. China (Nos. 70371040 and 02LJ-14-05-01).
Abstract: Although much work has been done to construct prediction models for yarn processing quality, the relation between spinning variables and yarn properties has not yet been established conclusively. Support vector machines (SVMs), based on statistical learning theory, are gaining applications in machine learning and pattern recognition because of their high accuracy and good generalization capability. This study briefly introduces the SVM regression algorithms and presents an SVM-based system architecture for predicting yarn properties. Model selection, which amounts to a search in hyper-parameter space, is performed with the grid-search method to find suitable parameters. The experimental results are compared with those of artificial neural network (ANN) models. The investigation indicates that, for small data sets and real-life production, SVM models maintain stable predictive accuracy and are more suitable for the noisy and dynamic spinning process.
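Grid search over a hyper-parameter space can be sketched as an exhaustive loop over all combinations of candidate values. The example below is a generic illustration, not the study's procedure: the parameter names `C`, `gamma` and `epsilon`, the candidate values, and the `mock_validation_error` function (which stands in for training an SVM and measuring its validation error) are all assumptions.

```python
import itertools
import math


def grid_search(error_fn, grid):
    """Evaluate every parameter combination and return the one with lowest error."""
    names = list(grid)
    best_params, best_error = None, math.inf
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = error_fn(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


def mock_validation_error(C, gamma, epsilon):
    """Hypothetical error surface with its minimum at C=10, gamma=0.1, epsilon=0.01."""
    return (
        (math.log10(C) - 1.0) ** 2
        + (math.log10(gamma) + 1.0) ** 2
        + (math.log10(epsilon) + 2.0) ** 2
    )


# Candidate values spaced by powers of ten, a common choice for SVM parameters.
grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.01, 0.1, 1],
    "epsilon": [0.001, 0.01, 0.1],
}
best_params, best_error = grid_search(mock_validation_error, grid)
print(best_params, best_error)
```

The cost grows as the product of the grid sizes (36 evaluations here), which is why a coarse logarithmic grid is usually searched first.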
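Abstract: Because the limit state equations of large, complex structures generally cannot be expressed explicitly, a structural reliability assessment method based on the least squares support vector machine (LS-SVM) is proposed. The method draws samples of the random variables by uniform sampling and evaluates them numerically with a deterministic finite element solver. The sample data are used for training, and LS-SVM establishes the nonlinear mapping between the random variables and the structural response, thereby approximating the structural limit state equation. The structural reliability index is then computed by evaluating the limit state function and its partial derivatives and solving an optimization problem. The results show that the method can assess the structural reliability of implicit limit state equations with high accuracy and good computational efficiency.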
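The last step, computing a reliability index from function values and partial derivatives, can be sketched with the Hasofer-Lind/Rackwitz-Fiessler (HL-RF) iteration. This is one common way to solve the underlying optimization problem, and the abstract does not say which solver the authors used. In the sketch, `surrogate_g` is a hypothetical explicit function standing in for the trained LS-SVM surrogate, the variables are assumed to be already transformed to standard normal space, and the derivatives are taken by central differences.

```python
import math


def gradient(g, u, h=1e-6):
    """Central-difference partial derivatives of g at the point u."""
    grad = []
    for i in range(len(u)):
        up = list(u)
        dn = list(u)
        up[i] += h
        dn[i] -= h
        grad.append((g(up) - g(dn)) / (2.0 * h))
    return grad


def hlrf_reliability_index(g, n_dim, tol=1e-8, max_iter=100):
    """HL-RF iteration in standard normal space; returns (beta, design point).

    Assumes the gradient of g does not vanish along the iteration path.
    """
    u = [0.0] * n_dim
    for _ in range(max_iter):
        grad = gradient(g, u)
        norm_sq = sum(d * d for d in grad)
        # Project onto the linearized limit state surface g = 0.
        scale = (sum(d * ui for d, ui in zip(grad, u)) - g(u)) / norm_sq
        u_new = [scale * d for d in grad]
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(u_new, u)))
        u = u_new
        if step < tol:
            break
    beta = math.sqrt(sum(ui * ui for ui in u))
    return beta, u


def surrogate_g(u):
    """Hypothetical explicit stand-in for the trained LS-SVM surrogate."""
    return 3.0 - (u[0] + u[1]) / math.sqrt(2.0)


beta, design_point = hlrf_reliability_index(surrogate_g, n_dim=2)
print(f"beta = {beta:.4f}")
```

The index is the distance from the origin to the nearest point on the surface g = 0. For a linear limit state the iteration reaches that point in a single projection step, whereas a nonlinear surrogate generally needs several.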