Funding: Supported by the Yunnan Major Scientific and Technological Projects (Grant No. 202302AD080001) and the National Natural Science Foundation, China (No. 52065033).
Abstract: When building a classification model, data imbalance refers to the situation in which one class has significantly more samples than the other. Data imbalance biases the trained classifier toward the majority class (usually defined as the negative class), which can harm accuracy on the minority class (usually defined as the positive class) and in turn degrade the overall performance of the model. This article proposes MSHR-FCSSVM, a method for imbalanced data classification that combines a new hybrid resampling approach (MSHR) with a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). MSHR measures the separability of each negative sample by its Silhouette value, computed from the Mahalanobis distance between samples. On this basis, so-called pseudo-negative samples are screened out, used to generate new positive samples by linear interpolation (the over-sampling step), and finally deleted (the under-sampling step). The approach thus replaces pseudo-negative samples with newly generated positive samples one for one, clearing up the inter-class overlap on the borderline without changing the overall size of the dataset. FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification, and it finely tunes the class cost weights with the RIME algorithm, an efficient optimization algorithm based on the physical phenomenon of rime ice, using cross-validation accuracy as the fitness function, so as to adjust the classification borderline accurately. To verify the effectiveness of the proposed method, a series of experiments was carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced ones. The experimental results show that MSHR-FCSSVM outperforms the compared methods in most cases and that both MSHR and FCSSVM play significant roles.
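The resampling idea described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the covariance estimate, the Silhouette threshold, or the interpolation partner, so the sketch assumes a covariance pooled over all samples, a threshold of 0, and interpolation between each pseudo-negative sample and its nearest positive sample. The names `pairwise_mahalanobis` and `hybrid_resample` are hypothetical, and the cost-sensitive SVM stage is not shown.

```python
import numpy as np

def pairwise_mahalanobis(A, B, VI):
    """Mahalanobis distance between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diff, VI, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors

def hybrid_resample(X, y, threshold=0.0, seed=0):
    """Replace low-Silhouette negatives (y == 0) with synthetic positives (y == 1)."""
    rng = np.random.default_rng(seed)
    # Assumption: one covariance matrix pooled over all samples (needs >= 2 features).
    VI = np.linalg.pinv(np.cov(X, rowvar=False))
    neg, pos = X[y == 0], X[y == 1]
    d_neg = pairwise_mahalanobis(neg, neg, VI)
    d_pos = pairwise_mahalanobis(neg, pos, VI)
    a = d_neg.sum(axis=1) / max(len(neg) - 1, 1)  # mean distance to own class, self excluded
    b = d_pos.mean(axis=1)                        # mean distance to the other class
    silhouette = (b - a) / np.maximum(np.maximum(a, b), 1e-12)
    pseudo = silhouette < threshold               # assumption: threshold of 0
    # Assumption: interpolate between each pseudo-negative and its nearest positive.
    anchors = pos[d_pos[pseudo].argmin(axis=1)]
    lam = rng.random((int(pseudo.sum()), 1))
    synthetic = anchors + lam * (neg[pseudo] - anchors)
    # Deleting the pseudo-negatives and adding one synthetic positive each keeps the size fixed.
    X_new = np.vstack([neg[~pseudo], pos, synthetic])
    y_new = np.concatenate([
        np.zeros(int((~pseudo).sum())),
        np.ones(len(pos) + len(synthetic)),
    ]).astype(y.dtype)
    return X_new, y_new, int(pseudo.sum())

# Toy data: 40 majority samples, 3 majority samples planted inside the minority cluster.
rng = np.random.default_rng(42)
majority = rng.normal(0.0, 0.5, size=(40, 2))
overlap = np.array([[4.0, 4.0], [4.1, 3.9], [3.9, 4.1]])
minority = rng.normal(4.0, 0.5, size=(10, 2))
X = np.vstack([majority, overlap, minority])
y = np.array([0] * 43 + [1] * 10)

X_new, y_new, n_pseudo = hybrid_resample(X, y)
print(n_pseudo, X.shape, X_new.shape)
```

Because each removed negative sample is matched by exactly one synthetic positive sample, the dataset keeps its original size while the class ratio shifts toward the minority class.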
Funding: Supported by the Scientific Research and Technology Development Plan Project of Liuzhou City in 2017 (2017BH30301) and the Self-supporting Scientific Research Project of Liuzhou Meteorological Bureau in 2016.
Abstract: The design rainstorm profile is the basis for the scientific and rational planning and design of urban drainage systems, and it provides a theoretical basis and accurate design parameters for municipal construction, water, and planning departments. In this paper, minute-by-minute rainfall process data from the Liuzhou National Meteorological Observation Station from 1975 to 2014 were used. The Chicago method was used to analyze the design rainstorm profile in the urban district of Liuzhou, and profiles were obtained for rainfalls lasting 30, 60, 90, 120, 150, and 180 min. The results showed that, in Liuzhou, the design rainstorm profile for a given duration was consistent across recurrence periods, and the short-duration rainfall profile was roughly single-peaked. The peak of each short-duration design rainstorm profile was located at about 1/3 of the whole rainfall process, and rainfall intensity increased as the recurrence period lengthened. For a given recurrence period, the peak rainfall intensity showed a "decrease-increase-decrease" pattern as the duration increased, and the peak value for the 120 min rainfall was the maximum.
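As an illustration of how a Chicago-method profile is typically built, the sketch below evaluates the standard Keifer-Chu hyetograph. It assumes a storm-intensity formula of the form i = a / (t + b)^n, which the abstract does not state. The parameter values `a`, `b`, and `n` are hypothetical placeholders, not the Liuzhou results. Only the peak-position ratio `r = 1/3` follows the abstract's finding, and the function name `chicago_hyetograph` is likewise hypothetical.

```python
def chicago_hyetograph(a, b, n, duration, r, dt=1.0):
    """Return (times, intensities) for a Chicago (Keifer-Chu) design storm.

    Assumes the intensity formula i = a / (t + b)**n, with t and duration in
    minutes and r the peak-position ratio (0 < r < 1). Intensities are
    evaluated at the midpoint of each dt-minute step.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must be in (0, 1)")
    t_peak = r * duration
    steps = int(round(duration / dt))
    times, intensities = [], []
    for k in range(steps):
        t = (k + 0.5) * dt
        if t <= t_peak:
            tau = (t_peak - t) / r          # rescaled time before the peak
        else:
            tau = (t - t_peak) / (1.0 - r)  # rescaled time after the peak
        i = a * ((1.0 - n) * tau + b) / (tau + b) ** (n + 1.0)
        times.append(t)
        intensities.append(i)
    return times, intensities

# Hypothetical parameters for a 120 min storm with the peak at 1/3 of the duration.
times, intensities = chicago_hyetograph(a=10.0, b=8.0, n=0.7, duration=120.0, r=1.0 / 3.0)
peak_time = times[intensities.index(max(intensities))]
total_depth = sum(intensities) * 1.0  # intensity times the 1 min step
print(peak_time, total_depth)
```

By construction, the total depth of the profile approximates the depth implied by the intensity formula for the full duration, a * T / (T + b)^n, so the hyetograph redistributes the design rainfall around the peak without changing its total.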