Funding: the Natural Science Foundation of Shaanxi Province (2005F51).
Abstract: Most ensemble learning algorithms use a centralized model that requires all training instances to be gathered on a single station, which is often difficult in practice. A distributed ensemble learning algorithm is proposed in which each instance carries two kinds of weight genes, one denoting the global distribution and one the local distribution. Instead of the repeated sampling used in standard ensemble learning, unbalanced sampling from each station is used to train that station's set of base classifiers. The concept of an effective nearby region for a local integration classifier is introduced and used for dynamic integration of multiple classifiers in a distributed environment. Experiments show that the proposed distributed ensemble learning algorithm effectively reduces the time needed to train the base classifiers while keeping classification performance on par with the centralized learning method.
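The abstract does not define the effective nearby region or the integration rule, so the following is only a minimal sketch of one generic form of dynamic integration, not the authors' method. It assumes that the nearby region is simply the k nearest labelled validation instances to the query, and that each station's classifier votes with a weight equal to its accuracy inside that region. The names `dynamic_vote`, `local_accuracy`, `station_a` and `station_b` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]
Classifier = Callable[[Point], int]
Labelled = Tuple[Point, int]


def local_accuracy(clf: Classifier, neighbours: List[Labelled]) -> float:
    """Fraction of the neighbouring instances that clf labels correctly."""
    hits = sum(1 for x, y in neighbours if clf(x) == y)
    return hits / len(neighbours)


def dynamic_vote(x: Point, classifiers: List[Classifier],
                 validation: List[Labelled], k: int = 3) -> int:
    """Weighted vote in which each classifier's weight is its local accuracy.

    Assumption: the "nearby region" is the k nearest validation instances
    by Euclidean distance; the paper may define it differently.
    """
    neighbours = sorted(validation, key=lambda xy: math.dist(x, xy[0]))[:k]
    votes: Dict[int, float] = {}
    for clf in classifiers:
        label = clf(x)
        votes[label] = votes.get(label, 0.0) + local_accuracy(clf, neighbours)
    return max(votes, key=lambda label: votes[label])


# Hypothetical base classifiers, each reliable only near its own station's data.
def station_a(x: Point) -> int:
    return 0  # this station only ever saw class 0


def station_b(x: Point) -> int:
    return 1  # this station only ever saw class 1


validation = [((0.0,), 0), ((1.0,), 0), ((2.0,), 0),
              ((8.0,), 1), ((9.0,), 1), ((10.0,), 1)]

low = dynamic_vote((1.5,), [station_a, station_b], validation)
high = dynamic_vote((9.5,), [station_a, station_b], validation)
```

In this toy setup an unweighted vote between the two classifiers would always tie, whereas weighting by local accuracy lets the classifier that is reliable near the query decide the label.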
Funding: Supported by the National Natural Science Foundation of China (70873117).
Abstract: This paper proposes an algorithm that combines member classifiers using the maximum probability and weighted average strategies. Using parallel computing, the algorithm is tested on a China-Brazil Earth Resources Satellite (CBERS) image for land cover classification. The results show that using three computers in parallel reduces the classification time by 30% compared with using a single computer with a dual-core processor. The accuracy of the final classified image is 93.34%, and the Kappa coefficient is 0.92. Combining multiple classifiers can improve the precision of image classification, and parallel computing can speed up the computation, making it possible to process remote sensing images with high efficiency and accuracy.
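The abstract names the two combination strategies and the Kappa statistic without giving formulas, so the sketch below uses their common textbook forms as an assumption. The weighted average rule fuses per-class posteriors with fixed classifier weights, the maximum probability rule picks the class with the single highest posterior from any member, and Cohen's kappa is computed from a confusion matrix. All function names, weights and numbers are illustrative and are not taken from the paper.

```python
from typing import List, Sequence


def weighted_average_rule(probs: Sequence[Sequence[float]],
                          weights: Sequence[float]) -> int:
    """probs[i][c] is classifier i's posterior for class c; returns a class index."""
    total = sum(weights)
    n_classes = len(probs[0])
    fused = [sum(w * p[c] for w, p in zip(weights, probs)) / total
             for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: fused[c])


def maximum_probability_rule(probs: Sequence[Sequence[float]]) -> int:
    """Pick the class that receives the highest posterior from any classifier."""
    n_classes = len(probs[0])
    best = [max(p[c] for p in probs) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: best[c])


def cohen_kappa(confusion: List[List[int]]) -> float:
    """Cohen's kappa from a square confusion matrix (rows: truth, cols: prediction)."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (n * n)
    return (observed - expected) / (1 - expected)


# Illustrative posteriors from three member classifiers over three classes.
posteriors = [[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.2, 0.7]]
member_weights = [0.5, 0.3, 0.2]  # hypothetical, e.g. derived from validation accuracy

by_average = weighted_average_rule(posteriors, member_weights)
by_maximum = maximum_probability_rule(posteriors)
kappa = cohen_kappa([[45, 5], [10, 40]])
```

The two rules can disagree on the same pixel: here the weighted average favours class 0 because the most trusted member supports it, while the maximum probability rule favours class 2 because one member is very confident in it.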