Abstract: To overcome the problem that confusion between texts limits precision in text retrieval, a new text retrieval algorithm that decreases confusion (DCTR) is proposed. The algorithm constructs a search template that represents the user's search intention through positive and negative training. Using the prior probabilities in the template, the supported probability and anti-supported probability of each text in the text library can be estimated for discrimination. The search results can then be ranked according to the similarity between each retrieved text and the template. The complexity of DCTR is close to that of term frequency and inverse document frequency (TF-IDF). Its ability to distinguish confusable texts improves, and the quality of the results increases, as the number of training rounds grows.
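The abstract does not give the DCTR formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: the template is taken to be a pair of smoothed term distributions learned from positive and negative example texts, and each library text is scored by the mean log-ratio of its "supported" to "anti-supported" term probabilities. The function names (`build_template`, `score`, `rank`), the add-one smoothing, and the example texts are all illustrative assumptions, not the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_probs(texts, vocab, alpha=1.0):
    """Estimate P(term) over a set of texts with add-alpha smoothing."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    total = sum(counts[v] for v in vocab) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}


def build_template(positive, negative):
    """Hypothetical template: term priors from positive and negative examples."""
    vocab = {tok for t in positive + negative for tok in tokenize(t)}
    return {
        "support": term_probs(positive, vocab),
        "anti": term_probs(negative, vocab),
    }


def score(text, template):
    """Mean log-ratio of supported to anti-supported probability per known token."""
    toks = [t for t in tokenize(text) if t in template["support"]]
    if not toks:
        return 0.0  # no evidence either way (assumed neutral score)
    log_ratio = sum(
        math.log(template["support"][t]) - math.log(template["anti"][t])
        for t in toks
    )
    return log_ratio / len(toks)


def rank(library, template):
    """Return library texts sorted from most to least supported."""
    return sorted(library, key=lambda t: score(t, template), reverse=True)


# Illustrative usage: the query word "apple" is confusable (fruit vs. company).
template = build_template(
    positive=["apple pie recipe", "baking apple tart"],
    negative=["apple iphone review", "apple laptop price"],
)
library = ["apple macbook price review", "apple pie baking tips", "weather today"]
ranked = rank(library, template)
```

In this toy run the shared word "apple" contributes nothing to the score, so the ranking is driven by the terms that separate the two senses. Adding more positive or negative examples sharpens the template, which mirrors the abstract's claim that discrimination improves with more training.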
Funding: Supported by the National Natural Science Foundation of China Key Project under Grant No. 70933003 and the National Natural Science Foundation of China under Grant Nos. 70871109 and 71203247.
Abstract: The probability of default (PD) is the key element in the New Basel Capital Accord and the most essential factor in financial institutions' risk management. To obtain good PD estimates, practitioners and academics have put forward numerous default prediction models. However, how to use multiple models together to enhance overall default prediction performance remains unexplored. In this paper, a combination model of parametric and non-parametric methods is proposed. First, a binary logistic regression model (BLRM), a support vector machine (SVM), and a decision tree (DT) are each used to establish models with relatively stable and high performance. Second, to further improve overall performance, a combination model is constructed using multiple discriminant analysis (MDA). In this way, the coverage rate of the combination model is greatly improved, and the risk of misjudgment is effectively reduced. Lastly, the results of the combination model are analyzed using K-means clustering, and the clustering distribution is consistent with a normal distribution. The results show that a combination model based on parametric and non-parametric methods can effectively enhance overall default prediction performance.
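The combination step can be illustrated with a small sketch. The abstract does not specify how the MDA stage is set up, so this example makes several assumptions: each borrower is represented by three default scores, one per base model (BLRM, SVM, DT), and a two-class Fisher linear discriminant stands in for MDA. The scores and labels below are made-up toy values, and `fit_fisher` and `predict` are hypothetical helper names.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length score vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, mu):
    """Within-class scatter matrix: sum of outer products of centered rows."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        c = [r[j] - mu[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                s[i][j] += c[i] * c[j]
    return s


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fit_fisher(scores, labels, ridge=1e-6):
    """Fit discriminant weights w = Sw^{-1} (mu1 - mu0) and a midpoint threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    mu1, mu0 = mean_vec(pos), mean_vec(neg)
    s1, s0 = scatter(pos, mu1), scatter(neg, mu0)
    d = len(mu1)
    # Small ridge term on the diagonal keeps the pooled scatter invertible.
    sw = [[s1[i][j] + s0[i][j] + (ridge if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    w = solve(sw, [mu1[j] - mu0[j] for j in range(d)])
    threshold = sum(w[j] * (mu1[j] + mu0[j]) / 2 for j in range(d))
    return w, threshold


def predict(w, threshold, s):
    """Return 1 (default) if the projected score exceeds the threshold, else 0."""
    return 1 if sum(wj * sj for wj, sj in zip(w, s)) > threshold else 0


# Toy data (assumed): [BLRM score, SVM score, DT vote] per borrower; 1 = default.
scores = [
    [0.9, 0.8, 1.0], [0.7, 0.9, 1.0], [0.8, 0.6, 0.0], [0.6, 0.7, 1.0],
    [0.2, 0.3, 0.0], [0.1, 0.2, 0.0], [0.3, 0.1, 1.0], [0.4, 0.2, 0.0],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
w, threshold = fit_fisher(scores, labels)
fitted = [predict(w, threshold, s) for s in scores]
```

Note that the decision tree vote disagrees with the true label for two of the toy borrowers, yet the combined discriminant still separates the classes, because it weights each base model by how well its scores discriminate. This is the kind of gain the abstract attributes to combining models, though the actual study's data and MDA configuration are not reproduced here.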