Funding: This work was supported in part by Research on Key Technologies of Intelligent Decision-Making for Food Big Data under Grant 2018A01038, in part by the National Science Fund for Youth of Hubei Province of China under Grant 2018CFB408, in part by the Natural Science Foundation of Hubei Province of China under Grant 2015CFA061, in part by the National Natural Science Foundation of China under Grant 61272278, and in part by the Major Technical Innovation Projects of Hubei Province under Grant 2018ABA099.
Abstract: Grain yield security is a basic national policy of China, and changes in grain yield are influenced by a variety of factors, which often have complex, non-linear relationships with each other. This paper therefore proposes a Grey Relational Analysis-Adaptive Boosting-Support Vector Regression (GRA-AdaBoost-SVR) model, which maintains prediction accuracy with small samples and improves generalization ability. SVR uses kernel functions to map inputs to high-dimensional spaces, which makes it well suited to nonlinear problems. Grain yield datasets generally have small sample sizes and many features, making SVR a promising choice for them. However, SVR's sensitivity to the selection of parameters and kernel functions makes the model less generalizable. To address this, the Adaptive Boosting (AdaBoost) algorithm is used, with SVR as the training method for its base learners, which effectively addresses SVR's generalization problem. In addition, to address AdaBoost's sensitivity to anomalous samples, the GRA method is used to extract the influence factors with higher correlation and reduce the number of anomalous samples. Finally, the GRA-AdaBoost-SVR model is applied to grain yield forecasting in China. Experiments were conducted to verify the correctness of the model and to compare it with several traditional models applied to the grain yield data. The results show that the GRA-AdaBoost-SVR algorithm improves prediction accuracy and produces a smoother model, confirming that it has better prediction performance and better generalization ability.
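The GRA step described in the abstract, ranking candidate influence factors by how closely they track the yield series, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the min-max normalization, the distinguishing coefficient `rho = 0.5` (the conventional default), the selection threshold, and the toy series are all assumptions, and the function names are hypothetical.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Grey relational grade of each factor series against the reference series.

    reference: list of floats (e.g. yearly grain yield).
    factors:   dict mapping factor name -> list of floats, same length as reference.
    rho:       distinguishing coefficient (assumed 0.5 here).
    """
    def normalize(series):
        # Min-max scaling to [0, 1]; an assumed preprocessing choice.
        lo, hi = min(series), max(series)
        if hi == lo:
            return [0.0] * len(series)
        return [(v - lo) / (hi - lo) for v in series]

    ref = normalize(reference)
    # Absolute difference between each normalized factor and the reference.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, normalize(values))]
        for name, values in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)

    grades = {}
    for name, ds in deltas.items():
        if d_max == 0:
            grades[name] = 1.0  # every factor matches the reference exactly
            continue
        # Grey relational coefficient per time point, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


def select_factors(grades, threshold):
    """Keep factors whose grade reaches the (assumed) threshold."""
    return [name for name, g in grades.items() if g >= threshold]


# Toy data, invented for illustration only.
yield_series = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate_factors = {
    "aligned": [10.0, 20.0, 30.0, 40.0, 50.0],  # moves with the yield
    "noisy": [5.0, 1.0, 4.0, 2.0, 3.0],         # unrelated pattern
}
grades = grey_relational_grades(yield_series, candidate_factors)
selected = select_factors(grades, threshold=0.8)
```

The selected factors would then serve as inputs to the boosted SVR regressors; that stage is omitted here because the abstract does not give its settings.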
Abstract: Cardiotocography (CTG) represents the fetus's health inside the womb during labor. However, assessment of its readings can be a highly subjective process that depends on the expertise of the obstetrician. Fetal monitors acquire parameters (i.e., fetal heart rate, contractions, acceleration) as digital signals. Objective: This paper aims to classify CTG readings from an imbalanced dataset of healthy, suspected, and pathological fetus readings. Method: We perform two sets of experiments. First, we employ five classifiers, Random Forest (RF), Adaptive Boosting (AdaBoost), Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LGBM), without over-sampling to classify CTG readings into three categories: healthy, suspected, and pathological. Second, we employ an ensemble of the above classifiers with over-sampling. We use a random over-sampling technique to balance the CTG records used to train the ensemble models. We use 3602 CTG readings to train the ensemble classifiers and 1201 records to evaluate them. The outputs of these classifiers are then fed into a soft voting classifier to obtain the most accurate results. Results: Each classifier is evaluated on accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUROC). The XGBoost, LGBM, and CatBoost classifiers yielded 99% accuracy. Conclusion: Using ensemble classifiers on a balanced CTG dataset improves detection accuracy compared with previous studies and with our first experiment. The soft voting classifier compensates for the weaknesses of any individual classifier, yielding superior performance for the overall model.
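The two mechanisms named in the Method, random over-sampling and soft voting, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the class counts and probability vectors are invented, and hard-coded probabilities stand in for the outputs of the trained RF, AdaBoost, CatBoost, XGBoost, and LGBM models.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap up to the majority size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def soft_vote(probabilities):
    """Average per-class probabilities across classifiers and return
    (winning class, averaged probabilities)."""
    classes = probabilities[0].keys()
    averaged = {
        c: sum(p[c] for p in probabilities) / len(probabilities)
        for c in classes
    }
    return max(averaged, key=averaged.get), averaged


# Toy imbalanced labels (invented): 6 healthy, 2 suspected, 1 pathological.
toy_labels = ["healthy"] * 6 + ["suspected"] * 2 + ["pathological"]
toy_rows = [[float(i)] for i in range(len(toy_labels))]
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
class_counts = Counter(balanced_labels)

# Stand-in probability outputs from three classifiers for one reading.
winner, averaged = soft_vote([
    {"healthy": 0.6, "suspected": 0.3, "pathological": 0.1},
    {"healthy": 0.2, "suspected": 0.5, "pathological": 0.3},
    {"healthy": 0.3, "suspected": 0.6, "pathological": 0.1},
])
```

Over-sampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation records.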