A new incremental support vector machine (SVM) algorithm is proposed which is based on multiple kernel learning. Through introducing multiple kernel learning into the SVM incremental learning, large scale data set l...A new incremental support vector machine (SVM) algorithm is proposed which is based on multiple kernel learning. Through introducing multiple kernel learning into the SVM incremental learning, large scale data set learning problem can be solved effectively. Furthermore, different punishments are adopted in allusion to the training subset and the acquired support vectors, which may help to improve the performance of SVM. Simulation results indicate that the proposed algorithm can not only solve the model selection problem in SVM incremental learning, but also improve the classification or prediction precision.展开更多
Turbopump condition monitoring is a significant approach to ensure the safety of liquid rocket engine (LRE).Because of lack of fault samples,a monitoring system cannot be trained on all possible condition patterns.T...Turbopump condition monitoring is a significant approach to ensure the safety of liquid rocket engine (LRE).Because of lack of fault samples,a monitoring system cannot be trained on all possible condition patterns.Thus it is important to differentiate abnormal or unknown patterns from normal pattern with novelty detection methods.One-class support vector machine (OCSVM) that has been commonly used for novelty detection cannot deal well with large scale samples.In order to model the normal pattern of the turbopump with OCSVM and so as to monitor the condition of the turbopump,a monitoring method that integrates OCSVM with incremental clustering is presented.In this method,the incremental clustering is used for sample reduction by extracting representative vectors from a large training set.The representative vectors are supposed to distribute uniformly in the object region and fulfill the region.And training OCSVM on these representative vectors yields a novelty detector.By applying this method to the analysis of the turbopump's historical test data,it shows that the incremental clustering algorithm can extract 91 representative points from more than 36 000 training vectors,and the OCSVM detector trained on these 91 representative points can recognize spikes in vibration signals caused by different abnormal events such as vane shedding,rub-impact and sensor faults.This monitoring method does not need fault samples during training as classical recognition methods.The method resolves the learning problem of large samples and is an alternative method for condition monitoring of the LRE turbopump.展开更多
A new fault-diagnosis method to be used in batch processes based on multi-phase regression is presented to overcome the difficulty arising in the processes due to non-uniform sample data in each phase.Support vector m...A new fault-diagnosis method to be used in batch processes based on multi-phase regression is presented to overcome the difficulty arising in the processes due to non-uniform sample data in each phase.Support vector machine is first used for phase identification,and for each phase,improved artificial immune network is developed to analyze and recognize fault patterns.A new cell elimination role is proposed to enhance the incremental clustering capability of the immune network.The proposed method has been applied to glutamic acid fermentation,comparison results have indicated that the proposed approach can better classify fault samples and yield higher diagnosis precision.展开更多
The structure and function of proteins are closely related, and protein structure decides its function, therefore protein structure prediction is quite important.β-turns are important components of protein secondary ...The structure and function of proteins are closely related, and protein structure decides its function, therefore protein structure prediction is quite important.β-turns are important components of protein secondary structure. So development of an accurate prediction method ofβ-turn types is very necessary. In this paper, we used the composite vector with position conservation scoring function, increment of diversity and predictive secondary structure information as the input parameter of support vector machine algorithm for predicting theβ-turn types in the database of 426 protein chains, obtained the overall prediction accuracy of 95.6%, 97.8%, 97.0%, 98.9%, 99.2%, 91.8%, 99.4% and 83.9% with the Matthews Correlation Coefficient values of 0.74, 0.68, 0.20, 0.49, 0.23, 0.47, 0.49 and 0.53 for types I, II, VIII, I’, II’, IV, VI and nonturn respectively, which is better than other prediction.展开更多
Based on the research of predictingβ-hairpin motifs in proteins, we apply Random Forest and Support Vector Machine algorithm to predictβ-hairpin motifs in ArchDB40 dataset. The motifs with the loop length of 2 to 8 ...Based on the research of predictingβ-hairpin motifs in proteins, we apply Random Forest and Support Vector Machine algorithm to predictβ-hairpin motifs in ArchDB40 dataset. The motifs with the loop length of 2 to 8 amino acid residues are extracted as research object and thefixed-length pattern of 12 amino acids are selected. When using the same characteristic parameters and the same test method, Random Forest algorithm is more effective than Support Vector Machine. In addition, because of Random Forest algorithm doesn’t produce overfitting phenomenon while the dimension of characteristic parameters is higher, we use Random Forest based on higher dimension characteristic parameters to predictβ-hairpin motifs. The better prediction results are obtained;the overall accuracy and Matthew’s correlation coefficient of 5-fold cross-validation achieve 83.3% and 0.59, respectively.展开更多
As the solutions of the least squares support vector regression machine (LS-SVRM) are not sparse, it leads to slow prediction speed and limits its applications. The defects of the ex- isting adaptive pruning algorit...As the solutions of the least squares support vector regression machine (LS-SVRM) are not sparse, it leads to slow prediction speed and limits its applications. The defects of the ex- isting adaptive pruning algorithm for LS-SVRM are that the training speed is slow, and the generalization performance is not satis- factory, especially for large scale problems. Hence an improved algorithm is proposed. In order to accelerate the training speed, the pruned data point and fast leave-one-out error are employed to validate the temporary model obtained after decremental learning. The novel objective function in the termination condition which in- volves the whole constraints generated by all training data points and three pruning strategies are employed to improve the generali- zation performance. The effectiveness of the proposed algorithm is tested on six benchmark datasets. The sparse LS-SVRM model has a faster training speed and better generalization performance.展开更多
According to the classic Karush-Kuhn-Tucker(KKT)theorem,at every step of incremental support vector machine(SVM)learning,the newly adding sample which violates the KKT conditions will be a new support vector(SV)and mi...According to the classic Karush-Kuhn-Tucker(KKT)theorem,at every step of incremental support vector machine(SVM)learning,the newly adding sample which violates the KKT conditions will be a new support vector(SV)and migrate the old samples between SV set and non-support vector(NSV)set,and at the same time the learning model should be updated based on the SVs.However,it is not exactly clear at this moment that which of the old samples would change between SVs and NSVs.Additionally,the learning model will be unnecessarily updated,which will not greatly increase its accuracy but decrease the training speed.Therefore,how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning.In this work,a new algorithm is proposed to select candidate SVs and use the wrongly predicted sample to trigger the incremental processing simultaneously.Experimental results show that the proposed algorithm can achieve good performance with high efficiency,high speed and good accuracy.展开更多
This paper analyzed the theory of incremental learning of SVM (support vector machine) and pointed out it is a shortage that the support vector optimization is only considered in present research of SVM incremental le...This paper analyzed the theory of incremental learning of SVM (support vector machine) and pointed out it is a shortage that the support vector optimization is only considered in present research of SVM incremental learning. According to the significance of keyword in training, a new incremental training method considering keyword adjusting was proposed, which eliminates the difference between incremental learning and batch learning through the keyword adjusting. The experimental results show that the improved method outperforms the method without the keyword adjusting and achieve the same precision as the batch method. Key words SVM (support vector machine) - incremental training - classification - keyword adjusting CLC number TP 18 Foundation item: Supported by the National Information Industry Development Foundation of ChinaBiography: SUN Jin-wen (1972-), male, Post-Doctoral, research direction: artificial intelligence, data mining and system integration.展开更多
Based on the concept of the pseudo amino acid composition (PseAAC), protein structural classes are predicted by using an approach of increment of diversity combined with support vector machine (ID-SVM), in which t...Based on the concept of the pseudo amino acid composition (PseAAC), protein structural classes are predicted by using an approach of increment of diversity combined with support vector machine (ID-SVM), in which the dipeptide amino acid composition of proteins is used as the source of diversity. Jackknife test shows that total prediction accuracy is 96.6% and higher than that given by other approaches. Besides, the specificity (Sp) and the Matthew's correlation coefficient (MCC) are also calculated for each protein structural class, the Sp is more than 88%, the MCC is higher than 92%, and the higher MCC and Sp imply that it is credible to use ID-SVM model predicting protein structural class. The results indicate that: 1 the choice of the source of diversity is reasonable, 2 the predictive performance of IDSVM is excellent, and3 the amino acid sequences of proteins contain information of protein structural classes.展开更多
基金supported by the National Natural Science Key Foundation of China(69974021)
文摘A new incremental support vector machine (SVM) algorithm is proposed which is based on multiple kernel learning. Through introducing multiple kernel learning into the SVM incremental learning, large scale data set learning problem can be solved effectively. Furthermore, different punishments are adopted in allusion to the training subset and the acquired support vectors, which may help to improve the performance of SVM. Simulation results indicate that the proposed algorithm can not only solve the model selection problem in SVM incremental learning, but also improve the classification or prediction precision.
基金supported by National Natural Science Foundation of China (Grant No. 50675219)Hu’nan Provincial Science Committee Excellent Youth Foundation of China (Grant No. 08JJ1008)
文摘Turbopump condition monitoring is a significant approach to ensure the safety of liquid rocket engine (LRE).Because of lack of fault samples,a monitoring system cannot be trained on all possible condition patterns.Thus it is important to differentiate abnormal or unknown patterns from normal pattern with novelty detection methods.One-class support vector machine (OCSVM) that has been commonly used for novelty detection cannot deal well with large scale samples.In order to model the normal pattern of the turbopump with OCSVM and so as to monitor the condition of the turbopump,a monitoring method that integrates OCSVM with incremental clustering is presented.In this method,the incremental clustering is used for sample reduction by extracting representative vectors from a large training set.The representative vectors are supposed to distribute uniformly in the object region and fulfill the region.And training OCSVM on these representative vectors yields a novelty detector.By applying this method to the analysis of the turbopump's historical test data,it shows that the incremental clustering algorithm can extract 91 representative points from more than 36 000 training vectors,and the OCSVM detector trained on these 91 representative points can recognize spikes in vibration signals caused by different abnormal events such as vane shedding,rub-impact and sensor faults.This monitoring method does not need fault samples during training as classical recognition methods.The method resolves the learning problem of large samples and is an alternative method for condition monitoring of the LRE turbopump.
基金Sponsored by the Research Foundation of Beijing Institute of Technology (20080642001)
文摘A new fault-diagnosis method to be used in batch processes based on multi-phase regression is presented to overcome the difficulty arising in the processes due to non-uniform sample data in each phase.Support vector machine is first used for phase identification,and for each phase,improved artificial immune network is developed to analyze and recognize fault patterns.A new cell elimination role is proposed to enhance the incremental clustering capability of the immune network.The proposed method has been applied to glutamic acid fermentation,comparison results have indicated that the proposed approach can better classify fault samples and yield higher diagnosis precision.
文摘The structure and function of proteins are closely related, and protein structure decides its function, therefore protein structure prediction is quite important.β-turns are important components of protein secondary structure. So development of an accurate prediction method ofβ-turn types is very necessary. In this paper, we used the composite vector with position conservation scoring function, increment of diversity and predictive secondary structure information as the input parameter of support vector machine algorithm for predicting theβ-turn types in the database of 426 protein chains, obtained the overall prediction accuracy of 95.6%, 97.8%, 97.0%, 98.9%, 99.2%, 91.8%, 99.4% and 83.9% with the Matthews Correlation Coefficient values of 0.74, 0.68, 0.20, 0.49, 0.23, 0.47, 0.49 and 0.53 for types I, II, VIII, I’, II’, IV, VI and nonturn respectively, which is better than other prediction.
文摘Based on the research of predictingβ-hairpin motifs in proteins, we apply Random Forest and Support Vector Machine algorithm to predictβ-hairpin motifs in ArchDB40 dataset. The motifs with the loop length of 2 to 8 amino acid residues are extracted as research object and thefixed-length pattern of 12 amino acids are selected. When using the same characteristic parameters and the same test method, Random Forest algorithm is more effective than Support Vector Machine. In addition, because of Random Forest algorithm doesn’t produce overfitting phenomenon while the dimension of characteristic parameters is higher, we use Random Forest based on higher dimension characteristic parameters to predictβ-hairpin motifs. The better prediction results are obtained;the overall accuracy and Matthew’s correlation coefficient of 5-fold cross-validation achieve 83.3% and 0.59, respectively.
基金supported by the National Natural Science Foundation of China (61074127)
文摘As the solutions of the least squares support vector regression machine (LS-SVRM) are not sparse, it leads to slow prediction speed and limits its applications. The defects of the ex- isting adaptive pruning algorithm for LS-SVRM are that the training speed is slow, and the generalization performance is not satis- factory, especially for large scale problems. Hence an improved algorithm is proposed. In order to accelerate the training speed, the pruned data point and fast leave-one-out error are employed to validate the temporary model obtained after decremental learning. The novel objective function in the termination condition which in- volves the whole constraints generated by all training data points and three pruning strategies are employed to improve the generali- zation performance. The effectiveness of the proposed algorithm is tested on six benchmark datasets. The sparse LS-SVRM model has a faster training speed and better generalization performance.
基金supported by the National Natural Science Foundation of China(Nos.U1509207 and 61325019)
文摘According to the classic Karush-Kuhn-Tucker(KKT)theorem,at every step of incremental support vector machine(SVM)learning,the newly adding sample which violates the KKT conditions will be a new support vector(SV)and migrate the old samples between SV set and non-support vector(NSV)set,and at the same time the learning model should be updated based on the SVs.However,it is not exactly clear at this moment that which of the old samples would change between SVs and NSVs.Additionally,the learning model will be unnecessarily updated,which will not greatly increase its accuracy but decrease the training speed.Therefore,how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning.In this work,a new algorithm is proposed to select candidate SVs and use the wrongly predicted sample to trigger the incremental processing simultaneously.Experimental results show that the proposed algorithm can achieve good performance with high efficiency,high speed and good accuracy.
文摘This paper analyzed the theory of incremental learning of SVM (support vector machine) and pointed out it is a shortage that the support vector optimization is only considered in present research of SVM incremental learning. According to the significance of keyword in training, a new incremental training method considering keyword adjusting was proposed, which eliminates the difference between incremental learning and batch learning through the keyword adjusting. The experimental results show that the improved method outperforms the method without the keyword adjusting and achieve the same precision as the batch method. Key words SVM (support vector machine) - incremental training - classification - keyword adjusting CLC number TP 18 Foundation item: Supported by the National Information Industry Development Foundation of ChinaBiography: SUN Jin-wen (1972-), male, Post-Doctoral, research direction: artificial intelligence, data mining and system integration.
基金Supported by the National Natural Science Foundation of China (30660044)
文摘Based on the concept of the pseudo amino acid composition (PseAAC), protein structural classes are predicted by using an approach of increment of diversity combined with support vector machine (ID-SVM), in which the dipeptide amino acid composition of proteins is used as the source of diversity. Jackknife test shows that total prediction accuracy is 96.6% and higher than that given by other approaches. Besides, the specificity (Sp) and the Matthew's correlation coefficient (MCC) are also calculated for each protein structural class, the Sp is more than 88%, the MCC is higher than 92%, and the higher MCC and Sp imply that it is credible to use ID-SVM model predicting protein structural class. The results indicate that: 1 the choice of the source of diversity is reasonable, 2 the predictive performance of IDSVM is excellent, and3 the amino acid sequences of proteins contain information of protein structural classes.
基金supported in part by the Special Fund of Jiangsu Province for the Transformation of Scientific and Technological Achievements(Nos.BA2018110,BA2021036)the Natural Science Foundation of Jiangsu Province(No.BK20211061)。
文摘对于制造承包商来说,在正式接收订单之前,为了指导报价和预测交货日期,有必要对工时(Man-hours,MH)进行评估。装配工时作为工时的重要组成部分,具有重要的实际研究意义。针对多规格、小批量生产的特点,提出了一种基于支持向量机(Support vector machine,SVM)的装配工时估算模型。除了单部件属性、装配过程和历史工时数据外,还考虑了可量化装配复杂性的最短路径长度平均值(Average of shortest path length,ASPL)作为装配MH的影响因素,并提出了基于Creo JLink三维模型的这些因素的自动计算方法。通过对几种算法的比较,选择SVM作为装配体MH建模的最优算法。将遗传算法(Genetic algorithm,GA)应用于SVM中,有利于在SVM中搜索最优参数c和g时避免了局部求解,加快了收敛速度。最后,对所提出的GA-SVM模型进行训练,并应用于雷达装置仿生腿的装配工时预测。实验结果表明,GA-SVM具有比本文其他方法更高的预测精度,整个预测过程仅需3 min左右。