Funding: This work was sponsored by the National Natural Science Foundation of China (Grant Nos. 61370083, 61073043, 61073041 and 61370086), the National Research Foundation for the Doctoral Program of Higher Education of China (20112304110011 and 20122304110012), the Natural Science Foundation of Heilongjiang Province (F200901), and the Harbin Outstanding Academic Leader Foundation of Heilongjiang Province of China (2011RFXXG015).
Abstract: Standard support vector machine (SVM) training algorithms have O(l^3) computational and O(l^2) space complexity, where l is the training set size, so they are computationally infeasible on very large data sets. To alleviate the computational burden of SVM training, we propose an algorithm that trains SVMs on a set of bound vectors extracted by Fisher projection. For linearly separable problems, we use the linear Fisher discriminant to compute the projection line; for non-linearly separable problems, we use the kernel Fisher discriminant. In each case, we select as bound vectors a given ratio of samples whose projections are adjacent to those of the other class. Theoretical analysis shows that the proposed algorithm has low computational and space complexity. Extensive experiments on several classification benchmarks demonstrate the effectiveness of our approach.
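The bound-vector selection described in this abstract can be sketched for the linear case as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`solve`, `fisher_direction`, `select_bound_vectors`), the small `ridge` term added to the within-class scatter matrix, and the rule of taking the same ratio from each class are all hypothetical choices. The kernel Fisher variant and the SVM training step itself are omitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mean(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fisher_direction(x0, x1, ridge=1e-6):
    """Linear Fisher discriminant direction w = Sw^{-1} (m1 - m0).

    `ridge` is an assumed regulariser that keeps Sw invertible.
    """
    m0, m1 = mean(x0), mean(x1)
    d = len(m0)
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for rows, mu in ((x0, m0), (x1, m1)):
        for r in rows:
            diff = [r[i] - mu[i] for i in range(d)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j]
    return solve(sw, [m1[i] - m0[i] for i in range(d)])


def select_bound_vectors(x0, x1, ratio=0.3):
    """Keep the given ratio of each class whose projections lie nearest the other class."""
    w = fisher_direction(x0, x1)

    def proj(r):
        return sum(wi * ri for wi, ri in zip(w, r))

    k0 = max(1, round(ratio * len(x0)))
    k1 = max(1, round(ratio * len(x1)))
    # w points from class 0 towards class 1, so the samples facing the other
    # class are the largest projections of class 0 and the smallest of class 1.
    b0 = sorted(x0, key=proj, reverse=True)[:k0]
    b1 = sorted(x1, key=proj)[:k1]
    return b0, b1


class0 = [[0, 0], [1, 0], [0, 1], [2, 1]]
class1 = [[5, 5], [4, 5], [5, 4], [3, 4]]
bound0, bound1 = select_bound_vectors(class0, class1, ratio=0.25)
print(bound0, bound1)
```

An SVM would then be trained only on `bound0` and `bound1`, so the training cost depends on the size of the reduced set rather than on l.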
Funding: Supported by the National Natural Science Foundation of China Key Project under Grant No. 70933003 and the National Natural Science Foundation of China under Grant Nos. 70871109 and 71203247.
Abstract: The probability of default (PD) is the key element of the New Basel Capital Accord and the most essential factor in financial institutions' risk management. To obtain good PD estimates, practitioners and academics have put forward numerous default prediction models. However, how to use multiple models to enhance overall default prediction performance remains unexplored. In this paper, a combination of parametric and non-parametric models is proposed. First, a binary logistic regression model (BLRM), a support vector machine (SVM), and a decision tree (DT) are each used to establish models with relatively stable and high performance. Second, to further improve overall performance, a combination model is constructed using multiple discriminant analysis (MDA). In this way, the coverage rate of the combination model is greatly improved and the risk of misjudgment is effectively reduced. Finally, the results of the combination model are analyzed using K-means clustering, and the clustering distribution is consistent with a normal distribution. The results show that a combination model built from parametric and non-parametric models can effectively enhance overall default prediction performance.
Funding: The authors sincerely acknowledge that this work was financially supported by the National Natural Science Foundation of China (Grant No. 31471413), the Undergraduate Scientific Research Project of Jiangsu University (Grant No. 17A274), and the University Natural Science Research Project of Anhui Province (Grant No. KJ2019A1129).
Abstract: Excessive pesticide residues on Chinese cabbage are harmful to people's health. Therefore, an identification system was designed for qualitative analysis of lambda-cyhalothrin residues on Chinese cabbage leaves. To extract discriminant information from mid-infrared (MIR) spectra of Chinese cabbage effectively, fuzzy uncorrelated discriminant vector (FUDV) analysis was proposed by introducing fuzzy set theory into uncorrelated discriminant vector (UDV) analysis. In this system, a Cary 630 FTIR spectrometer was used to scan four samples of Chinese cabbage with different concentrations of lambda-cyhalothrin. The MIR spectra were preprocessed by standard normal variate (SNV) transformation and Savitzky-Golay (SG) smoothing. Next, the dimensionality of the high-dimensional MIR spectra was reduced by principal component analysis (PCA). UDV, FUDV, and several other discriminant analysis algorithms were then each used for feature extraction. Finally, a K-nearest neighbor (KNN) classifier was employed to classify the data. The experimental results showed that, with FUDV as the feature extraction algorithm, the identification system reached its maximum classification accuracy of 100%. The results indicate that FUDV combined with MIR spectroscopy is an effective method for identifying lambda-cyhalothrin residues on Chinese cabbage.
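Two of the pipeline stages named in this abstract, SNV preprocessing and KNN classification, are simple enough to sketch directly. The code below is an illustrative sketch only: the function names `snv` and `knn_predict`, the choice of the sample standard deviation, the Euclidean distance, and the toy data are assumptions, and the SG smoothing, PCA, and UDV/FUDV steps are not shown.

```python
import math
from collections import Counter


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit
    sample standard deviation, removing per-spectrum offset and scale."""
    n = len(spectrum)
    mu = sum(spectrum) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in spectrum) / (n - 1))
    return [(v - mu) / sd for v in spectrum]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to `query`
    (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda t: math.dist(t[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy feature vectors standing in for extracted discriminant features.
train_x = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_y = ["a", "a", "b", "b"]
print(snv([2.0, 4.0, 6.0]))
print(knn_predict(train_x, train_y, [5, 6], k=3))
```

Because SNV normalises each spectrum independently, two spectra that differ only by a constant offset or a positive scale factor map to the same output.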